

CANADIAN 
JOURNAL OF MATHEMATICS 


Journal Canadien de Mathématiques 


VOLUME I 
1949 


Published for 
THE CANADIAN MATHEMATICAL CONGRESS 


by the University of Toronto Press 











EDITORIAL BOARD 
H. S. M. Coxeter, A. Gauthier, L. Infeld, R. D. James, R. L. Jeffery, 
G. de B. Robinson 
with the co-operation of 


R. Brauer, J. Chapelon, D. B. DeLury, P. Dubreil, I. Halperin, 
W. V. D. Hodge, S. MacLane, L. J. Mordell, G. Pall, J. L. Synge, 
A. W. Tucker, W. J. Webber 













INDEX OF VOLUME I 


Birkhoff, G. and Burton, L. Note on Newtonian force-fields, 199 

Bruck, R. H. and Ryser, H. J. The nonexistence of certain finite pro- 
jective planes, 88 

Burton, L. See Birkhoff and Burton 


Busemann, H. Angular measure and integral curvature, 279 
Chowla, S. D. and Todd, J. The density of reducible integers, 297 


Davenport, H. and Pólya, G. On the product of two power series, 1 
Duff, G. F. D. Factorization ladders and eigenfunctions, 379 


Einstein, A. and Infeld, L. On the motion of particles in general 
relativity theory, 209 
Ellis, H.W. Mean-continuous integrals, 113 


Fenchel, W. On conjugate convex functions, 73 
Frame, J.S. Congruence relations between the traces of matrix powers, 


Frucht, R. Graphs of degree three with a given abstract group, 365 
Hall, M. Subgroups of finite index in free groups, 187 

Infeld, L. See Einstein and Infeld 

Kaplansky, I. Groups with representations of bounded degree, 105 


Lalan, V. Sur les surfaces à courbure moyenne isotherme, 6 
Lorentz, G. G. Direct theorems on methods of summability, 305 


MacDuffee, C. C. Orthogonal matrices in four-space, 69 

Mahler, K. On the critical lattices of arbitrary point sets, 78 

———— On a theorem of Liouville in fields of positive characteristic, 397 

Mendelsohn, N.S. Applications of combinatorial formulae to general- 
izations of Wilson’s Theorem, 328 

Menger, K. Generalized vector spaces. I. The structure of finite- 
dimensional spaces, 94 

Minakshisundaram, S. A generalization of Epstein zeta functions (With 
supplementary note by H. Weyl), 320 

———— and Pleijel, A. Some properties of the eigenfunctions of the 
Laplace operator on Riemannian manifolds, 242 

Morse, M. and Transue, W. Functionals of bounded Fréchet variation, 153 





Pall, G. Representation by quadratic forms, 344 
Pleijel, A. See Minakshisundaram and Pleijel 
Pólya, G. See Davenport and Pólya 


Rado, R. Axiomatic treatment of rank in infinite sets, 337 

Robinson, G. de B. On the disjoint product of irreducible representations 
of the symmetric group, 166 

Ryser, H. J. See Bruck and Ryser 


Schild, A. Discrete space-time and integral Lorentz transformations, 29 
Snapper, E. Completely indecomposable modules, 125 

Stone, M. H. Boundedness properties in function-lattices, 176 

Synge, J. L. On the motion of three vortices, 257 


Taussky, O. On a theorem of Latimer and MacDuffee, 300 

Titchmarsh, E.C. On the uniqueness of the Green’s function associated 
with a second-order differential equation, 191 

Todd, J. See Chowla and Todd 

Transue, W. See Morse and Transue 

Turnbull, H.W. Note upon the generalized Cayleyan operator, 48 


Weinstein, A. On surface waves, 271 

Weyl, H. Elementary algebraic treatment of the quantum mechanical 
symmetry problem, 57 

—————Supplementary note to a paper by Minakshisundaram, 326 










CANADIAN JOURNAL OF MATHEMATICS 


Journal Canadien de Mathématiques 


VOL. I- NO. I 
1949 


On the product of two power series                H. Davenport and G. Pólya      1 
Sur les surfaces à courbure moyenne isotherme     Victor Lalan                   6 
Discrete space-time and integral Lorentz 
   transformations                                Alfred Schild                 29 
Note upon the generalized Cayleyan operator       H. W. Turnbull                48 
Elementary algebraic treatment of the quantum 
   mechanical symmetry problem                    Hermann Weyl                  57 
Orthogonal matrices in four-space                 C. C. MacDuffee               69 
On conjugate convex functions                     W. Fenchel                    73 
On the critical lattices of arbitrary point sets  K. Mahler                     78 
The nonexistence of certain finite 
   projective planes                              R. H. Bruck and H. J. Ryser   88 
Generalized vector spaces. I. The structure of 
   finite-dimensional spaces                      Karl Menger                   94 
Groups with representations of bounded degree     Irving Kaplansky             105 




The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the 
Editor-in-Chief, H. S. M. Coxeter, University of Toronto. Every paper 
should contain an introduction summarizing the results as far as possible 
in such a way as to be understood by the non-expert. 


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent 
to the Managing Editor. The price per volume of four numbers is 
$6.00. This is reduced to $3.00 for individuals who are members of 
the following Societies: 


Canadian Mathematical Congress 
American Mathematical Society 
Mathematical Association of America 
London Mathematical Society 

Société Mathématique de France 


The Canadian Mathematical Congress gratefully acknowledges 
the assistance of the following towards the cost of publishing this 
Journal: 


University of British Columbia 
University of Manitoba McGill University 
Queen’s University University of Toronto 
and 


The American Mathematical Society 
















ON THE PRODUCT OF TWO POWER SERIES 


H. DAVENPORT AND G. PÓLYA 


We consider the product of two power series with positive coefficients: 
        $\left(\sum u_n x^n\right)\left(\sum v_n x^n\right) = \sum w_n x^n.$ 

What conditions will ensure that the coefficients $w_n$ shall be either (i) monotonic, 
or (ii) logarithmically convex? By the latter, we mean that 
$w_n^2 \le w_{n-1} w_{n+1}$ for $n = 1, 2, \ldots$. In investigating this question, which 
was suggested by a special example, we have found it convenient to express the 
conditions in terms of the ratios of $u_n$, $v_n$ to certain binomial coefficients, rather 
than in terms of $u_n$, $v_n$ themselves. 

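We introduce $\alpha$ and $\beta$ such that 
(1)     $\alpha > 0$, $\beta > 0$, $\alpha + \beta = 1$, 
and let 
(2)     $\alpha_n = \dfrac{\alpha(\alpha+1)\cdots(\alpha+n-1)}{n!}$, $\beta_n = \dfrac{\beta(\beta+1)\cdots(\beta+n-1)}{n!}$ 
for $n \ge 1$; $\alpha_0 = \beta_0 = 1$. Let 
        $a_n = u_n/\alpha_n$, $b_n = v_n/\beta_n$, 
so that $a_n$ and $b_n$ are positive, and 
(3)     $w_n = \alpha_0\beta_n a_0 b_n + \alpha_1\beta_{n-1} a_1 b_{n-1} + \ldots + \alpha_n\beta_0 a_n b_0.$ 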
We have been led to the following very elementary results, which appear, 
however, to be new. 

THEOREM 1. If $a_n$ and $b_n$ are both monotonic increasing, so is $w_n$, and if $a_n$ 
and $b_n$ are both monotonic decreasing, so is $w_n$. 

THEOREM 2. If $a_n$ and $b_n$ are both logarithmically convex, so is $w_n$. 

We prove these theorems in §1 and §2, and add some general remarks concerning 
them in §3. In §4 we apply them to the special example from which our 
investigation started. In §5 we mention the integral analogues. 
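
The two theorems are easy to test numerically. The following is a minimal sketch, not part of the paper: it builds $w_n$ from formula (3) for sample sequences chosen here purely for illustration (the helper names and the value $\alpha = 0.3$ are assumptions), and checks the conclusions of Theorems 1 and 2 up to floating-point tolerance.

```python
def rising_coeffs(alpha, n_max):
    """Return [alpha_0, ..., alpha_{n_max}], alpha_n = alpha(alpha+1)...(alpha+n-1)/n!."""
    coeffs = [1.0]
    for n in range(1, n_max + 1):
        coeffs.append(coeffs[-1] * (alpha + n - 1) / n)
    return coeffs

def product_coeffs(alpha, a, b):
    """Coefficients w_n of formula (3), with beta = 1 - alpha as required by (1)."""
    n_max = min(len(a), len(b)) - 1
    al = rising_coeffs(alpha, n_max)
    be = rising_coeffs(1.0 - alpha, n_max)
    return [sum(al[m] * be[n - m] * a[m] * b[n - m] for m in range(n + 1))
            for n in range(n_max + 1)]

def is_increasing(seq, tol=1e-12):
    return all(x <= y * (1 + tol) for x, y in zip(seq, seq[1:]))

def is_log_convex(seq, tol=1e-12):
    return all(seq[n] ** 2 <= seq[n - 1] * seq[n + 1] * (1 + tol)
               for n in range(1, len(seq) - 1))

N = 30  # illustrative truncation length
inc_a = [2 - 1 / (n + 1) for n in range(N)]   # monotonic increasing
inc_b = [1 + n / (n + 2) for n in range(N)]   # monotonic increasing
lc_a = [1 / (n + 1) for n in range(N)]        # logarithmically convex
lc_b = [1 + 0.5 ** n for n in range(N)]       # logarithmically convex

w_inc = product_coeffs(0.3, inc_a, inc_b)     # Theorem 1 case
w_lc = product_coeffs(0.3, lc_a, lc_b)        # Theorem 2 case
w_one = product_coeffs(0.3, [1.0] * N, [1.0] * N)
print(is_increasing(w_inc), is_log_convex(w_lc))
```

With $a_n = b_n = 1$ the product is $(1-x)^{-\alpha}(1-x)^{-\beta} = (1-x)^{-1}$, so every entry of `w_one` should equal 1; this gives a quick sanity check of the normalisation.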




1. The proof of Theorem 1 may be decomposed into two steps, the first 
of which is concerned only with properties of the binomial coefficients. 

Put 
(4)     $p_0 = \alpha_0\beta_n$, $p_1 = \alpha_1\beta_{n-1}$, ..., $p_n = \alpha_n\beta_0$, 
        $q_0 = \alpha_0\beta_{n+1}$, $q_1 = \alpha_1\beta_n$, ..., $q_{n+1} = \alpha_{n+1}\beta_0$. 

Then we assert that 
(5)     $p_0 + p_1 + \ldots + p_n = q_0 + q_1 + \ldots + q_{n+1} = 1$, 
and 
(6)     $q_0 < p_0 < q_0 + q_1 < p_0 + p_1 < \ldots < q_0 + q_1 + \ldots + q_n < p_0 + p_1 + \ldots + p_n$. 
Thus we assert that the successive partial sums of the two sequences $p_0, p_1, \ldots$ 
and $q_0, q_1, \ldots$ separate each other. If we imagine each sequence represented 


Received January 29, 1948. 













by a row of blocks, the two rows will have a form similar to that of two 
neighbouring rows of tiles in a wall, and we can express the property in question by 
saying that the two sequences are "tilewise ordered." 

Of the two results (5) and (6), the former is immediate, since, by (2), 
        $\sum_0^\infty \alpha_n x^n = (1 - x)^{-\alpha}$, $\sum_0^\infty \beta_n x^n = (1 - x)^{-\beta}$, 
and so, by (1), 
        $\sum_0^\infty (\alpha_0\beta_n + \ldots + \alpha_n\beta_0)\,x^n = (1 - x)^{-1} = \sum_0^\infty x^n.$ 

To prove (6), we observe that, by (1) and (2), the $\alpha_n$ and $\beta_n$ are monotonic 
decreasing, whence 
        $q_0 + q_1 + \ldots + q_k = \alpha_0\beta_{n+1} + \alpha_1\beta_n + \ldots + \alpha_k\beta_{n+1-k}$ 
                $< \alpha_0\beta_n + \alpha_1\beta_{n-1} + \ldots + \alpha_k\beta_{n-k}$ 
                $= p_0 + p_1 + \ldots + p_k.$ 

Similarly 
        $q_{n+1} + q_n + \ldots + q_{k+1} = \alpha_{n+1}\beta_0 + \alpha_n\beta_1 + \ldots + \alpha_{k+1}\beta_{n-k}$ 
                $< \alpha_n\beta_0 + \alpha_{n-1}\beta_1 + \ldots + \alpha_k\beta_{n-k}$ 
                $= p_n + p_{n-1} + \ldots + p_k.$ 
In view of (5), this implies that 
        $q_0 + q_1 + \ldots + q_k > p_0 + p_1 + \ldots + p_{k-1}$, 
and the proof of (6) is complete. 
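
The "tilewise" property can be checked directly for particular values. The sketch below is an illustration added here, with $\alpha = 0.3$ and $n = 6$ chosen arbitrarily and hypothetical helper names: it forms the partial sums of the $p_m$ and $q_m$ defined in (4) and tests the chain of inequalities (6) together with the normalisation (5).

```python
from itertools import accumulate
from math import isclose

def rising_coeffs(alpha, n_max):
    """Return [alpha_0, ..., alpha_{n_max}], alpha_n = alpha(alpha+1)...(alpha+n-1)/n!."""
    coeffs = [1.0]
    for n in range(1, n_max + 1):
        coeffs.append(coeffs[-1] * (alpha + n - 1) / n)
    return coeffs

def tilewise_partial_sums(alpha, n):
    """Partial sums of p_0..p_n and q_0..q_{n+1} from (4), with beta = 1 - alpha."""
    al = rising_coeffs(alpha, n + 1)
    be = rising_coeffs(1.0 - alpha, n + 1)
    p = [al[m] * be[n - m] for m in range(n + 1)]
    q = [al[m] * be[n + 1 - m] for m in range(n + 2)]
    return list(accumulate(p)), list(accumulate(q))

def is_tilewise_ordered(P, Q):
    """Test Q_0 < P_0 < Q_1 < P_1 < ... < Q_n < P_n, i.e. the chain (6)."""
    chain = [s for pair in zip(Q, P) for s in pair]  # zip stops after (Q_n, P_n)
    return all(x < y for x, y in zip(chain, chain[1:]))

P, Q = tilewise_partial_sums(0.3, 6)
print(is_tilewise_ordered(P, Q), P[-1], Q[-1])
```

The last two printed values are the full sums in (5) and should both be 1 up to rounding.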
For the second step in the proof of Theorem 1, we introduce symbols for the 
successive differences of the terms in (6). We put 
        $r_0 = q_0$, $r'_0 = p_0 - q_0$, $r_1 = (q_0 + q_1) - p_0$, $r'_1 = (p_0 + p_1) - (q_0 + q_1)$, ..., 
        $r_n = (q_0 + \ldots + q_n) - (p_0 + \ldots + p_{n-1})$, $r'_n = q_{n+1}$. 
All these numbers are positive, and we have 
        $p_0 = r_0 + r'_0$, $p_1 = r_1 + r'_1$, ..., $p_n = r_n + r'_n$, 
        $q_0 = r_0$, $q_1 = r'_0 + r_1$, ..., $q_n = r'_{n-1} + r_n$, $q_{n+1} = r'_n$. 
Hence, by (3) and (4), 
        $w_n = r_0 a_0 b_n + r'_0 a_0 b_n + r_1 a_1 b_{n-1} + \ldots + r_n a_n b_0 + r'_n a_n b_0$, 
        $w_{n+1} = r_0 a_0 b_{n+1} + r'_0 a_1 b_n + r_1 a_1 b_n + \ldots + r_n a_n b_1 + r'_n a_{n+1} b_0$. 

These expressions render Theorem 1 immediate, on comparison of corresponding 
terms. 


2. To prove Theorem 2, we use the following lemma: 
LEMMA. Let $W_n$ be defined by 

(7)     $W_n = a_0 b_n + \binom{n}{1} a_1 b_{n-1} + \binom{n}{2} a_2 b_{n-2} + \ldots + a_n b_0.$ 

Then, if $a_n$ and $b_n$ are positive and logarithmically convex, so is $W_n$. 
Proof. The desired result $W_n^2 \le W_{n-1} W_{n+1}$ holds for $n = 1$ since 
        $W_0 W_2 - W_1^2 = a_0 b_0 (a_0 b_2 + 2a_1 b_1 + a_2 b_0) - (a_0 b_1 + a_1 b_0)^2$ 
                $= a_0^2 (b_0 b_2 - b_1^2) + b_0^2 (a_0 a_2 - a_1^2) \ge 0.$ 
We prove it for general n by induction. 







By the well-known property 
        $\binom{n}{m} = \binom{n-1}{m} + \binom{n-1}{m-1}$ 
of the binomial coefficients, we have, for $n \ge 1$, 
        $W_n = W'_{n-1} + W''_{n-1}$, 
where $W'_{n-1}$ is formed with the sequences $a_1, a_2, \ldots$ and $b_0, b_1, \ldots$ and $W''_{n-1}$ 
is formed with the sequences $a_0, a_1, \ldots$ and $b_1, b_2, \ldots$. By the hypothesis of 
the induction, applied to the two former sequences, we have 
        $(W'_{n-1})^2 \le W'_{n-2} W'_n$, 
and similarly 
        $(W''_{n-1})^2 \le W''_{n-2} W''_n$. 
By the inequality of the arithmetic and geometric means, it follows that 
        $2 W'_{n-1} W''_{n-1} \le 2\{W'_{n-2} W'_n W''_{n-2} W''_n\}^{1/2} \le W'_{n-2} W''_n + W''_{n-2} W'_n.$ 
Hence, using again the hypothesis of the induction, we obtain 
        $W_n^2 = (W'_{n-1} + W''_{n-1})^2 \le W'_{n-2} W'_n + W'_{n-2} W''_n$ 
                $+ W''_{n-2} W'_n + W''_{n-2} W''_n = W_{n-1} W_{n+1}.$ 
This proves the Lemma. 
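
The Lemma and the Pascal-rule splitting used in its proof can be verified in exact arithmetic. This is a small sketch under assumptions made only for illustration: the sequences $a_n = 1/(n+1)$ and $b_n = 1 + 2^{-n}$ (both logarithmically convex) and the function names are not from the paper.

```python
from fractions import Fraction
from math import comb

def binomial_product(a, b, n):
    """W_n of formula (7): sum over m of C(n, m) * a_m * b_{n-m}."""
    return sum(comb(n, m) * a[m] * b[n - m] for m in range(n + 1))

def is_log_convex(seq):
    return all(seq[n] ** 2 <= seq[n - 1] * seq[n + 1]
               for n in range(1, len(seq) - 1))

N = 12  # illustrative truncation
a = [Fraction(1, n + 1) for n in range(N + 1)]
b = [1 + Fraction(1, 2 ** n) for n in range(N + 1)]
W = [binomial_product(a, b, n) for n in range(N)]

# Splitting used in the induction, here for n = 5:
n = 5
W_prime = binomial_product(a[1:], b, n - 1)         # a_1, a_2, ... with b_0, b_1, ...
W_double_prime = binomial_product(a, b[1:], n - 1)  # a_0, a_1, ... with b_1, b_2, ...
print(is_log_convex(W), W[n] == W_prime + W_double_prime)
```

Using `Fraction` avoids any rounding question in the comparison $W_n^2 \le W_{n-1}W_{n+1}$.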


An immediate corollary to the Lemma is that the same conclusion holds for 
$W_n(\lambda, \mu)$ defined by 

(8)     $W_n(\lambda, \mu) = a_0 b_n \mu^n + \binom{n}{1} a_1 b_{n-1} \lambda\mu^{n-1} + \ldots + a_n b_0 \lambda^n,$ 

where $\lambda$, $\mu$ are any two positive numbers. 
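We can now prove Theorem 2 as follows. By (1) and (2), we have 

        $\alpha_m\beta_{n-m} = \binom{n}{m}\dfrac{\Gamma(\alpha + m)\Gamma(\beta + n - m)}{n!\,\Gamma(\alpha)\Gamma(\beta)}$ 
                $= \binom{n}{m}\dfrac{1}{\Gamma(\alpha)\Gamma(\beta)}\int_0^1 t^{\alpha+m-1}(1 - t)^{\beta+n-m-1}\,dt.$ 

Substituting in (3), and using the notation of (8), we obtain 

        $w_n = \dfrac{1}{\Gamma(\alpha)\Gamma(\beta)}\int_0^1 t^{\alpha-1}(1 - t)^{\beta-1}\,W_n(t, 1 - t)\,dt.$ 

Since $W_n(t, 1 - t)$ is logarithmically convex for each $t$, it follows from the 
inequality of Schwarz that $w_n$ is, since 

        $\Gamma(\alpha)\Gamma(\beta)\,w_n \le \int_0^1 t^{\alpha-1}(1 - t)^{\beta-1}\{W_{n-1}(t, 1 - t)\,W_{n+1}(t, 1 - t)\}^{1/2}\,dt$ 
                $\le \left\{\int_0^1 t^{\alpha-1}(1 - t)^{\beta-1}\,W_{n-1}(t, 1 - t)\,dt\right\}^{1/2}\left\{\int_0^1 t^{\alpha-1}(1 - t)^{\beta-1}\,W_{n+1}(t, 1 - t)\,dt\right\}^{1/2}$ 
                $= \left(\Gamma(\alpha)\Gamma(\beta)\,w_{n-1}\cdot\Gamma(\alpha)\Gamma(\beta)\,w_{n+1}\right)^{1/2}.$ 
This proves Theorem 2. 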













3. The two theorems proved above have a certain resemblance to the 
following simple but useful theorem of Kaluza.¹ 
If the $a_n$ are positive and logarithmically convex, and 
        $(a_0 + a_1 x + a_2 x^2 + \ldots)^{-1} = b_0 - b_1 x - b_2 x^2 - \ldots,$ 
then all the $b_n$ are positive. 

All three theorems give conditions which ensure that a power series, derived 
from given power series by multiplication or division, shall have some 
simple property. 
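
Kaluza's theorem is easy to observe on an example. The following sketch (an illustration added here, with a hypothetical function name) inverts a truncated power series by the usual recurrence and applies it to $a_n = 1/(n+1)$, which is positive and logarithmically convex; the coefficients after the constant term come out negative, as the theorem predicts.

```python
from fractions import Fraction

def reciprocal_series(a, n_terms):
    """First n_terms coefficients c_n of 1 / (a_0 + a_1 x + a_2 x^2 + ...).

    Uses c_0 = 1/a_0 and, for n >= 1, sum_{k=0}^{n} a_k c_{n-k} = 0.
    """
    c = [1 / a[0]]
    for n in range(1, n_terms):
        c.append(-sum(a[k] * c[n - k] for k in range(1, n + 1)) / a[0])
    return c

a = [Fraction(1, n + 1) for n in range(10)]   # positive, logarithmically convex
c = reciprocal_series(a, 10)
b = [c[0]] + [-x for x in c[1:]]              # Kaluza's b_n: series is b_0 - b_1 x - ...
print(c[:4])
```

In the notation of the theorem, $b_0 = c_0$ and $b_n = -c_n$ for $n \ge 1$, so the check is simply that every entry of `b` is positive.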

There is one class of power series to which our theorems can readily be 
applied. Suppose $\phi(t)$ is positive and integrable in the interval $(0, h)$, and let 

(9)     $\int_0^h \phi(t)(1 - xt)^{-\alpha}\,dt = \sum \alpha_n a_n x^n.$ 
Then 
        $a_n = \int_0^h \phi(t)\,t^n\,dt,$ 

and the $a_n$, being the successive moments of a positive function, are 
logarithmically convex. 


4. The particular problem from which our investigation started was that 
of showing that 

(10)    $\left[\int_0^1 (1 + u^4 - 2xu^2)^{-1/2}\,du\right]^{-2} + \left[\int_0^1 (1 + u^4 + 2xu^2)^{-1/2}\,du\right]^{-2}$ 

decreases steadily as x increases from 0 to 1. 

(It can be shown that the expression (10) represents $(2r_0\Lambda/\pi)^2$, where $r_0$ 
denotes the inner conformal radius of a rectangle with respect to its centre, and 
$\Lambda$ denotes the principal frequency of vibration of a membrane with the 
rectangle as its boundary. The product $r_0\Lambda$ depends on the shape but not on the 
size of the rectangle, and the parameter x specifies this shape. As x increases 
from 0 to 1, the ratio of the two sides of the rectangle increases steadily from 
1 to infinity. Our assertion concerning (10) means that the product $r_0\Lambda$ 
decreases steadily in this process.) 

By the change of variable 
        $2u^2/(1 + u^4) = t$ 
the first integral in (10) is transformed into an integral $I(x)$ of the type (9), 
with $h = 1$ and $\alpha = 1/2$. Theorem 2, applied to this integral, tells us that 
the coefficients of the power series for $I^2(x)$ are logarithmically convex. The 
second integral in (10) is $I(-x)$, so the odd powers of x cancel in the sum, and from 
Kaluza's theorem it follows that the expression (10) has the form 
        $2b_0 - 2b_2 x^2 - 2b_4 x^4 - \ldots$ 
with positive $b_n$. This obviously decreases as x increases. 
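
The monotonicity claimed for (10) can be observed numerically. The sketch below is an illustration only: it reads (10) as $I(x)^{-2} + I(-x)^{-2}$ with $I(x) = \int_0^1 (1 + u^4 - 2xu^2)^{-1/2}\,du$ (this reading of the printed formula is an assumption), evaluates the integrals with a hand-written composite Simpson rule, and samples $x$ on $[0, 0.9]$, staying away from $x = 1$ where the first integrand becomes singular at $u = 1$.

```python
from math import sqrt

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def I(x):
    """I(x) = integral over 0 <= u <= 1 of (1 + u^4 - 2 x u^2)^(-1/2)."""
    return simpson(lambda u: 1.0 / sqrt(1.0 + u ** 4 - 2.0 * x * u * u), 0.0, 1.0)

def expression_10(x):
    """Expression (10), read as I(x)^(-2) + I(-x)^(-2)."""
    return I(x) ** -2 + I(-x) ** -2

xs = [k / 10 for k in range(10)]          # 0.0, 0.1, ..., 0.9
values = [expression_10(x) for x in xs]
print(values)
```

The printed values should form a strictly decreasing list, in agreement with the expansion $2b_0 - 2b_2x^2 - 2b_4x^4 - \ldots$ with positive coefficients.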

We should perhaps observe that instead of using Theorem 2 in the above 
argument, we can use the following ad hoc argument. We have 

        $I^2(x) = \int_0^1\!\int_0^1 \phi(t)\phi(t')(1 - xt)^{-1/2}(1 - xt')^{-1/2}\,dt\,dt'.$ 


1 Math. Zeit., vol. 28 (1928), 161-170. 







Let 
        $(1 - xt)^{-1/2}(1 - xt')^{-1/2} = \sum A_n(t, t')\,x^n;$ 
then 
        $I^2(x) = \sum c_n x^n,$ 
where 
        $c_n = \int_0^1\!\int_0^1 \phi(t)\phi(t')\,A_n(t, t')\,dt\,dt'.$ 

If we prove that $A_n(t, t')$ is logarithmically convex, for fixed $t$, $t'$, it will follow 
that $c_n$ is logarithmically convex, as desired. In fact, it is easily seen that 

        $A_n(t, t') = \dfrac{1}{\pi}\int_0^\pi (t\cos^2\theta + t'\sin^2\theta)^n\,d\theta,$ 

and this is obviously logarithmically convex. 
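
The integral representation of $A_n(t, t')$ can be compared with the coefficients obtained by multiplying the two binomial series. The sketch below is illustrative: the normalisation $\frac{1}{\pi}\int_0^\pi$ is assumed (it is the one that gives $A_0 = 1$), the sample point $(t, t') = (0.3, 0.8)$ is arbitrary, and the function names are hypothetical. The midpoint rule is used because it is exact for trigonometric polynomials of this degree once the number of nodes exceeds $n$.

```python
from math import cos, sin, pi, isclose

def half_coeffs(n_max):
    """Coefficients of (1 - x)^(-1/2): gamma_n = (1/2)(3/2)...(n - 1/2)/n!."""
    c = [1.0]
    for n in range(1, n_max + 1):
        c.append(c[-1] * (n - 0.5) / n)
    return c

def A_series(t, tp, n):
    """Coefficient of x^n in (1 - x t)^(-1/2) (1 - x t')^(-1/2)."""
    g = half_coeffs(n)
    return sum(g[m] * g[n - m] * t ** m * tp ** (n - m) for m in range(n + 1))

def A_integral(t, tp, n, steps=256):
    """(1/pi) * integral over [0, pi] of (t cos^2 + t' sin^2)^n, by the midpoint rule."""
    h = pi / steps
    total = sum((t * cos((j + 0.5) * h) ** 2 + tp * sin((j + 0.5) * h) ** 2) ** n
                for j in range(steps))
    return total * h / pi

t, tp = 0.3, 0.8
series_vals = [A_series(t, tp, n) for n in range(10)]
integral_vals = [A_integral(t, tp, n) for n in range(10)]
print(all(isclose(s, i, rel_tol=1e-9) for s, i in zip(series_vals, integral_vals)))
```

Logarithmic convexity in $n$ then follows from the Schwarz inequality applied to the integral, exactly as for the moments in §3.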
5. For the sake of completeness, we mention the integral analogues of 
Theorems 1 and 2, although they are less interesting. 


Suppose that $f(x)$ and $g(x)$ are positive and integrable for $x \ge 0$, and 
bounded in any finite interval. We retain (1) and put 

        $h(x) = \int_0^x t^{\alpha-1} f(t)\,(x - t)^{\beta-1} g(x - t)\,dt.$ 

THEOREM 3. If $f(x)$ and $g(x)$ are both monotonic increasing, so is $h(x)$, and 
if $f(x)$ and $g(x)$ are both monotonic decreasing, so is $h(x)$. 

THEOREM 4. If $f(x)$ and $g(x)$ are both logarithmically convex, so is $h(x)$. 

We say that $f(x)$ is logarithmically convex, if for $x \ge d > 0$, 
        $f^2(x) \le f(x - d)\,f(x + d).$ 
By changing the variable of integration and using (1), we obtain 

        $h(x) = \int_0^1 u^{\alpha-1}(1 - u)^{\beta-1} f(ux)\,g((1 - u)x)\,du,$ 

and this representation of $h(x)$ renders Theorem 3 obvious. By the hypothesis 
of Theorem 4 and Schwarz's inequality, 

        $h(x) \le \int_0^1 u^{\alpha-1}(1 - u)^{\beta-1}\{f(u[x - d])\,f(u[x + d])\}^{1/2}\{g([1 - u][x - d])\,g([1 - u][x + d])\}^{1/2}\,du$ 
                $\le \left\{\int_0^1 u^{\alpha-1}(1 - u)^{\beta-1} f(u[x - d])\,g([1 - u][x - d])\,du\right\}^{1/2}$ 
                $\cdot\left\{\int_0^1 u^{\alpha-1}(1 - u)^{\beta-1} f(u[x + d])\,g([1 - u][x + d])\,du\right\}^{1/2}$ 
                $= \{h(x - d)\,h(x + d)\}^{1/2}.$ 

This proves Theorem 4. 


University College, London 
Stanford University 








SUR LES SURFACES A COURBURE MOYENNE 
ISOTHERME 


VICTOR LALAN 


In this paper we study a class of surfaces whose differential equation is of 
the fifth order and which have the property that the lines of equal mean 
curvature on them form, together with their orthogonal trajectories, an isothermal 
system. This class includes the surfaces admitting a one-parameter group of 
displacements (cylinders, surfaces of revolution and helicoids), as well as the 
surfaces admitting an infinity of deformations that preserve the principal 
curvatures (the surfaces of Ossian Bonnet). 

Our method rests essentially on the use of certain differential forms which 
arose in our study of the minimal lines of surfaces, but which also admit a simple 
definition in the real domain, as we show in n° 1. On this subject one may consult 
various Notes that we have communicated to the Académie des Sciences,¹ and 
also a memoir due to appear in the Bulletin de la Société Mathématique de France 
for 1947. We assume that the reader is familiar with the methods developed by 
E. Cartan in his various works.² 


I. DEFINITION AND GENERAL PROPERTIES 

1. We call surfaces with isothermal mean curvature, or more briefly HI 
surfaces, the surfaces on which the lines of equal mean curvature form, together 
with their orthogonal trajectories, an isothermal system. Their study, which does 
not seem to have been undertaken systematically, is greatly facilitated by the use 
of certain differential forms, which we have called the minimal forms of the 
surface, and which are none other than the differentials of the pseudo-arcs of the 
lines of zero length of the surface. These forms can moreover be defined without 
appeal to the theory of minimal lines, as follows. 

If $ds^2$ and $\varphi$ denote the two quadratic forms of the surface, $a$ and $c$ the 
principal curvatures, and $\varpi_1$, $\varpi_2$ the arc elements of the lines of curvature, 
we have 
        $ds^2 = \varpi_1^2 + \varpi_2^2$, $\varphi = a\varpi_1^2 + c\varpi_2^2$, 
hence 
        $\varphi - c\,ds^2 = (a - c)\varpi_1^2$, $\varphi - a\,ds^2 = (c - a)\varpi_2^2$. 

Received February 11, 1948. 
1 … 1946, 1947, 1948, passim. 
2 La théorie des groupes finis et continus et la géométrie différentielle (Paris, 1937). 
Les systèmes différentiels extérieurs et leurs applications géométriques (Paris, 1945). 
See also, on the surfaces of O. Bonnet, a memoir by Cartan published in the 
Bull. des Sciences Math., vol. 66 (1942), 55-85, where bibliographical references 
will be found. 




Suppose $a > c$, which is permissible at every point that is not an umbilic, and 
set 
(1)     $\theta_1 = \sqrt{\varphi - c\,ds^2} = \sqrt{a - c}\,\varpi_1$, $\theta_2 = \sqrt{a\,ds^2 - \varphi} = \sqrt{a - c}\,\varpi_2$. 
The minimal forms $\omega_1$ and $\omega_2$ are then defined by 
(2)     $2\omega_1 = \theta_1 - i\theta_2$, $2\omega_2 = \theta_1 + i\theta_2$. 

It is therefore entirely equivalent to use the real forms $\theta_1$, $\theta_2$, which we 
shall call the principal forms, or the conjugate imaginary forms $\omega_1$, $\omega_2$. 
On the other hand, many properties become apparent when one uses $\theta_1$ and $\theta_2$ 
(or $\omega_1$ and $\omega_2$) which remain hidden as long as one keeps to the 
Darboux-Cartan forms $\varpi_1$ and $\varpi_2$. 

We set 
        $H = \dfrac{a + c}{2}$, $A = \dfrac{a - c}{2}$. 
$H$ is the mean curvature; $A$ will be called the asphericity. We assume that the 
region studied is free of singularities and umbilics, and we arrange for $A$ to be 
positive. 

The quadratic forms of the surface are, as one sees at once, 
(3)     $ds^2 = \dfrac{2}{A}\,\omega_1\omega_2$, $\varphi = \omega_1^2 + \dfrac{2H}{A}\,\omega_1\omega_2 + \omega_2^2$. 

Define further the minimal invariants $r$ and $s$ by 
(4)     $d\omega_1 = r[\omega_1\omega_2]$, $d\omega_2 = s[\omega_2\omega_1]$, 
and the principal invariants $\rho$ and $\sigma$ by 
(5)     $d\theta_1 = \rho[\theta_1\theta_2]$, $d\theta_2 = \sigma[\theta_2\theta_1]$. 

One verifies without difficulty that 
(6)     $r = \sigma - i\rho$, $s = \sigma + i\rho$. 
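
Relation (6) can be recovered in a few lines from (2), (4) and (5). The computation below is a sketch added for the reader, not taken from the paper; it writes $\omega_1, \omega_2$ for the minimal forms, $\theta_1, \theta_2$ for the principal forms, and uses only the antisymmetry of the exterior product.

```latex
\begin{align*}
\theta_1 &= \omega_1 + \omega_2, \qquad \theta_2 = i(\omega_1 - \omega_2)
  && \text{(solving (2))} \\
[\theta_1\theta_2] &= i\bigl([\omega_2\omega_1] - [\omega_1\omega_2]\bigr)
  = -2i\,[\omega_1\omega_2] \\
2\,d\omega_1 &= d\theta_1 - i\,d\theta_2 = (\rho + i\sigma)[\theta_1\theta_2]
  = 2(\sigma - i\rho)[\omega_1\omega_2] \\
2\,d\omega_2 &= d\theta_1 + i\,d\theta_2 = (\rho - i\sigma)[\theta_1\theta_2]
  = 2(\sigma + i\rho)[\omega_2\omega_1]
\end{align*}
```

Comparing the last two lines with (4) gives $r = \sigma - i\rho$ and $s = \sigma + i\rho$. The same substitution gives $\theta_1^2 + \theta_2^2 = 4\omega_1\omega_2$, which is how the first formula of (3) arises.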


2. The first important result that we obtain through the use of the principal 
forms is the expression of the classical equations of Codazzi as a single total 
differential equation. 

In E. Cartan's method of the moving trihedron, one has 
        $\varpi_{12} = h\varpi_1 + k\varpi_2$, 
with 
(7)     $d\varpi_1 = h[\varpi_1\varpi_2]$, $d\varpi_2 = k[\varpi_1\varpi_2]$, 
and 
        $\varpi_{13} = a\varpi_1$, $\varpi_{23} = c\varpi_2$. 

Now, two of the integrability conditions read 
(8)     $a_2 = h(a - c)$, $c_1 = k(a - c)$. 
In these formulas the indices 1 and 2 attached to $c$ and $a$ denote derivatives 
relative to $\varpi_1$, $\varpi_2$, according to the scheme 
        $df = f_1\varpi_1 + f_2\varpi_2$. 













Introduce in the same way the derivatives $T_1 f$, $T_2 f$ relative to $\theta_1$, $\theta_2$, 
satisfying 
        $df = T_1 f\,\theta_1 + T_2 f\,\theta_2$, 
so that, taking account of (1), 
(9)     $f_1 = \sqrt{a - c}\;T_1 f$, $f_2 = \sqrt{a - c}\;T_2 f$. 

Exterior differentiation of (1) gives 
(10)    $\rho = \dfrac{1}{(a - c)^{3/2}}\left[h(a - c) - \dfrac{a_2 - c_2}{2}\right]$, $\sigma = -\dfrac{1}{(a - c)^{3/2}}\left[k(a - c) + \dfrac{a_1 - c_1}{2}\right]$, 
whence, on solving, 
        $a_1 = -k(a - c) - 2(a - c)^{3/2}\sigma$, $c_2 = -h(a - c) + 2(a - c)^{3/2}\rho$. 
Consequently, taking account of (8), 
(11)    $T_1(a + c) = -2(a - c)\sigma$, $T_2(a + c) = 2(a - c)\rho$, 
and, since $a + c = 2H$, the two equations of Codazzi condense, as announced, 
into a single total differential equation, 
(12)    $dH = 2A(-\sigma\theta_1 + \rho\theta_2)$, 
which we shall use rather in the equivalent form 
(13)    $dH = -2A(r\omega_1 + s\omega_2)$. 


3. Suppose that the lines $H = C$ form a family of isothermal curves on the 
surface. There then exist isotropic parameters $u$, $v$, first integrals of $\omega_1$, $\omega_2$ 
respectively, such that $H = f(u + v)$. We take $u$ and $v$ to be complex conjugates, 
so that $u + v$ is a real harmonic function on the surface. The complex variable $v$ 
will be said to be attached to the curves $H = C$, in the sense that the curves 
$H = C$ are obtained by setting the real part of $v$ equal to a constant. Since 
$H = f(u + v)$, the equation $dH = 0$, equivalent by (13) to $r\omega_1 + s\omega_2 = 0$, must be 
equivalent to $du + dv = 0$. Now set $\omega_1 = \alpha(u, v)\,du$, $\omega_2 = \beta(u, v)\,dv$; 
by (4) we shall have 
        $r = -\dfrac{\alpha_v}{\alpha\beta}$, $s = -\dfrac{\beta_u}{\alpha\beta}$, 
and 
        $r\omega_1 + s\omega_2 = -\dfrac{\alpha_v}{\beta}\,du - \dfrac{\beta_u}{\alpha}\,dv$; 
this last expression contains $du + dv$ as a factor only if 
        $\dfrac{\alpha_v}{\beta} = \dfrac{\beta_u}{\alpha}$, or $\alpha\alpha_v = \beta\beta_u$, 
that is, if the form $\alpha^2\,du + \beta^2\,dv$ is an exact differential $d\psi$, and 
consequently if $\alpha = \sqrt{\psi_u}$, $\beta = \sqrt{\psi_v}$, and 










(14)    $\omega_1 = \sqrt{\psi_u}\,du$, $\omega_2 = \sqrt{\psi_v}\,dv$, 
whence this proposition: if the lines of equal mean curvature of a surface form 
a family of isothermal curves, there exist isotropic coordinates $u$, $v$ and a 
function $\psi(u, v)$ such that the minimal forms are $\omega_1 = \sqrt{\psi_u}\,du$, 
$\omega_2 = \sqrt{\psi_v}\,dv$, while the mean curvature is $H = f(u + v)$. 

We shall call $\psi(u, v)$ the primitive function of the HI surface, to recall 
that the minimal forms of such a surface are derived from $\psi$ by differentiation. 


4. Denote by $\varphi_1$ the form $r\omega_1 + s\omega_2$ and by $\varphi_2$ the form 
$i(r\omega_1 - s\omega_2)$; the lines $\varphi_1 = 0$, $\varphi_2 = 0$ are orthogonal, since 
(15)    $\varphi_1^2 + \varphi_2^2 = (2Ars)\left(\dfrac{2\omega_1\omega_2}{A}\right)$, 
and the second factor on the right-hand side is the line element of the surface 
(formula (3)). Next set 
(16)    $d\varphi_1 = R[\varphi_1\varphi_2]$, $d\varphi_2 = S[\varphi_2\varphi_1]$. 
The form 
(17)    $\chi = S\varphi_1 + R\varphi_2$ 
is important to consider, as we shall see; taking account of the formulas 
(18)    $S = \dfrac{r^2 + s^2 - s_1 - r_2}{2rs}$, $R = i\,\dfrac{r^2 - s^2 + s_1 - r_2}{2rs}$, 
which are obtained by exterior differentiation, it can be written 
(19)    $\chi = \left(s - \dfrac{s_1}{s}\right)\omega_1 + \left(r - \dfrac{r_2}{r}\right)\omega_2$. 

(In the two preceding formulas, the indices 1 and 2 attached to $r$ and $s$ denote 
derivatives relative to $\omega_1$ and $\omega_2$; the same convention holds in what follows.) 
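
The passage from (17) and (18) to (19) is a short algebraic step. The sketch below, added here as an aid and not taken from the paper, writes $\varphi_1 = r\omega_1 + s\omega_2$, $\varphi_2 = i(r\omega_1 - s\omega_2)$, $\chi$ for the form (17), and uses $iR = -(r^2 - s^2 + s_1 - r_2)/(2rs)$.

```latex
\begin{align*}
\chi &= S\varphi_1 + R\varphi_2
      = r(S + iR)\,\omega_1 + s(S - iR)\,\omega_2, \\
S + iR &= \frac{(r^2 + s^2 - s_1 - r_2) - (r^2 - s^2 + s_1 - r_2)}{2rs}
        = \frac{s^2 - s_1}{rs}, \\
S - iR &= \frac{(r^2 + s^2 - s_1 - r_2) + (r^2 - s^2 + s_1 - r_2)}{2rs}
        = \frac{r^2 - r_2}{rs}, \\
\chi &= \Bigl(s - \frac{s_1}{s}\Bigr)\omega_1 + \Bigl(r - \frac{r_2}{r}\Bigr)\omega_2 .
\end{align*}
```

The formulas (18) themselves follow from (4) in the same way, using $[\varphi_1\varphi_2] = -2irs\,[\omega_1\omega_2]$.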
If $H = f(u + v)$, which entails (14), one finds that $\chi$ is an exact 
differential, namely 
(20)    $\chi = d\log\dfrac{\sqrt{\psi_u\psi_v}}{\psi_{uv}}$. 

Conversely, if $\chi$ is an exact differential, $e^{\int\chi}$ is an integrating factor 
for both $\varphi_1$ and $\varphi_2$, for one verifies without difficulty that 
        $d(e^{\int\chi}\varphi_1) = 0$, $d(e^{\int\chi}\varphi_2) = 0$. 
One can therefore set 
        $e^{\int\chi}\varphi_1 = dp$, $e^{\int\chi}\varphi_2 = dq$, 
and, as a result, the line element of the surface becomes, by (15), 
        $ds^2 = \dfrac{e^{-2\int\chi}(dp^2 + dq^2)}{2Ars}$, 
which shows that the lines $\varphi_1 = 0$, $\varphi_2 = 0$, that is, the lines $H = C$ and 
their orthogonal trajectories, form an isothermal system. Whence the proposition: 
the necessary and sufficient condition for a surface to have isothermal mean 
curvature (to be an HI surface) is that the form $\chi$ of formula (19) be an exact 
differential. 
















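The formulas special to HI surfaces are summarized below: 
(21)    $\omega_1 = \sqrt{\psi_u}\,du$, $\omega_2 = \sqrt{\psi_v}\,dv$, 
        $r = -\dfrac{\psi_{uv}}{2\psi_u\sqrt{\psi_v}}$, $s = -\dfrac{\psi_{uv}}{2\psi_v\sqrt{\psi_u}}$, 
        $r\omega_1 + s\omega_2 = \varphi_1 = -\dfrac{\psi_{uv}}{2\sqrt{\psi_u\psi_v}}\,(du + dv)$, $\chi = d\log\dfrac{\sqrt{\psi_u\psi_v}}{\psi_{uv}}$, 
        $H = f(u + v)$, $A = \dfrac{\sqrt{\psi_u\psi_v}}{\psi_{uv}}\,f'(u + v)$. 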


5. The two quadratic forms of an HI surface are 
(22)    $ds^2 = \dfrac{2\psi_{uv}}{f'(u + v)}\,du\,dv$, $\varphi = \psi_u\,du^2 + \dfrac{2f(u + v)}{f'(u + v)}\,\psi_{uv}\,du\,dv + \psi_v\,dv^2$. 
They contain only the two functions $\psi(u, v)$ and $f(u + v)$. These two 
functions are moreover connected by the third integrability condition, which 
expresses the theorema egregium of Gauss. When the forms are written 
        $ds^2 = 2F\,du\,dv$, $\varphi = L\,du^2 + 2M\,du\,dv + N\,dv^2$, 
this theorem is expressed by 
        $K = -\dfrac{1}{F}\,\dfrac{\partial^2}{\partial u\,\partial v}(\log F)$ ($K = H^2 - A^2$, total curvature). 
Taking account of the expressions for $H$, $A$ and $F$, this formula becomes here 
(23)    $\dfrac{\partial^2}{\partial u\,\partial v}\log\dfrac{\psi_{uv}}{f'(t)} = \dfrac{f'(t)\,\psi_u\psi_v}{\psi_{uv}} - \dfrac{f^2(t)\,\psi_{uv}}{f'(t)}$ ($t = u + v$). 

It will be noticed that, on a real surface, $\psi_{uv}$ and $f'(u + v)$ have the same 
sign, by the first equation (22), but this sign may be either. 

In the foregoing we have implicitly assumed that $\psi(u, v)$ is not a harmonic 
function on the surface; in the contrary case one would have $\psi = U + V$, 
$\omega_1 = \sqrt{U'}\,du$, $\omega_2 = \sqrt{V'}\,dv$, so that $\omega_1$ and $\omega_2$ would be exact 
differentials, whence, by (4), $r$ and $s$ would vanish and, by (13), $H$ would be 
constant. When we speak of HI surfaces, we shall always assume that $H$ is not 
constant and, consequently, that $\psi$ is not a harmonic function on the surface. 











6. The primitive function $\psi(u, v)$ is not invariant, because the harmonic 
function $t = u + v$ which enters into its definition is not uniquely defined; it is 
subject only to the condition of being constant on the lines $H = C$. It may be 
replaced by $\bar t = at + b$ ($a$, $b$ constants). Under this change, $\psi_u\,du^2$, which is 
$\omega_1^2$, becomes $a\psi_{\bar u}\cdot d\bar u^2/a^2$, which can indeed be written 
$\bar\psi_{\bar u}\,d\bar u^2$, but on condition of setting $\bar\psi = \psi/a$. Thus, when $t$ is 
replaced by $at + b$, $\psi$ must be replaced by $\psi/a$, so that $\psi\,dt$ remains 
invariant. 




The primitive curves, $\psi(u, v) = C$, of the HI surface stand in a remarkable 
relation to the curves $H = C$ and the lines of curvature. From their differential 
equation, $\psi_u\,du + \psi_v\,dv = 0$, which can also be written 
$\sqrt{\psi_u}\,\omega_1 + \sqrt{\psi_v}\,\omega_2 = 0$, one deduces that they cut the first lines 
of curvature at an angle $\alpha$ given by 
        $e^{2i\alpha} = -\dfrac{\sqrt{\psi_u}}{\sqrt{\psi_v}}$; 
now, the lines $H = C$ have the equation $r\omega_1 + s\omega_2 = 0$, that is, 
$\sqrt{\psi_v}\,\omega_1 + \sqrt{\psi_u}\,\omega_2 = 0$; they therefore cut the first lines of 
curvature at an angle $\beta$ satisfying 
        $e^{2i\beta} = -\dfrac{\sqrt{\psi_v}}{\sqrt{\psi_u}} = e^{-2i\alpha}$; 
hence $\beta = -\alpha$, and we have the proposition: on an HI surface, the primitive 
lines and the lines of equal mean curvature are bisected by the lines of 
curvature. 


7. The proposition stated in n° 4 enables us to obtain the differential 
equation of HI surfaces in invariant form. The condition that the form $\chi$ be an 
exact differential is expressed by a relation of the 5th order, which is none other 
than the equation sought. It reads 
        $r_1 - [(\log r)_{21} - s(\log r)_2] = s_2 - [(\log s)_{12} - r(\log s)_1]$. 
Introduce the second differential parameter of Beltrami, whose expression, for 
an arbitrary function $f$, is 
        $\Delta_2 f = 2A(f_{21} - s f_2)$, and also $2A(f_{12} - r f_1)$. 
The preceding equation becomes 
        $r_1 - \dfrac{\Delta_2(\log r)}{2A} = s_2 - \dfrac{\Delta_2(\log s)}{2A}$, 
or finally 
(24)    $\Delta_2\left(\log\dfrac{r}{s}\right) = 2A(r_1 - s_2)$, 
which is, in the end, the equation of HI surfaces. It can moreover be formulated 
otherwise. 

$\beta$ being as before the angle at which the curves $H = C$ cut the first lines 
of curvature, we have 
        $e^{2i\beta} = -\dfrac{r}{s}$, 
hence 
        $\Delta_2\left(\log\dfrac{r}{s}\right) = 2i\,\Delta_2\beta$. 









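Moreover, the form $s\omega_1 + r\omega_2$ has exterior differential $(r_1 - s_2)[\omega_1\omega_2]$, 
hence 
        $r_1 - s_2 = \dfrac{d(s\omega_1 + r\omega_2)}{[\omega_1\omega_2]}$, 
and equation (24) becomes 
(25)    $i\,\Delta_2\beta = A\,\dfrac{d(s\omega_1 + r\omega_2)}{[\omega_1\omega_2]}$. 

We shall meet the form $s\omega_1 + r\omega_2$ again in the next section. 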


II. ISOTHERMIC HI SURFACES 

8. A surface is isothermic if the arc elements of the lines of curvature have 
a common integrating factor. By (1) and (2), the minimal forms will then have 
one too. Call it $\mu$, and express that $\mu\omega_1$ and $\mu\omega_2$ are exact 
differentials; we get 
        $\mu_2 - r\mu = 0$, $\mu_1 - s\mu = 0$, 
whence 
        $\dfrac{d\mu}{\mu} = s\omega_1 + r\omega_2$, 
which may be stated: on every isothermic surface, the form $s\omega_1 + r\omega_2$ is an 
exact differential, and $e^{\int(s\omega_1 + r\omega_2)}$ is an integrating factor for both 
$\omega_1$ and $\omega_2$. 

To recall this role of $s\omega_1 + r\omega_2$, we shall call it the isothermic form of 
the surface. The differential equation of isothermic surfaces follows from the 
above; expressing that $s\omega_1 + r\omega_2$ is an exact differential, one finds the 
equation 
        $r_1 - s_2 = 0$; 
it is of the fourth order. 

The lines of curvature of isothermic surfaces form an isothermal net; a complex 
variable $z$ can therefore be attached to them. To this end, set 
(26)    $e^{\int(s\omega_1 + r\omega_2)}\,\omega_1 = dz_0$, $e^{\int(s\omega_1 + r\omega_2)}\,\omega_2 = dz$ 
        ($z_0 = x - iy$, $z = x + iy$). 
The complex variable $z$ meets the requirement, for one sees, by (1) and (2), that 
(27)    $dx = \sqrt{\dfrac{A}{2}}\;e^{\int(s\omega_1 + r\omega_2)}\,\varpi_1$, $dy = \sqrt{\dfrac{A}{2}}\;e^{\int(s\omega_1 + r\omega_2)}\,\varpi_2$, 
which shows that the first lines of curvature, $\varpi_2 = 0$, are indeed obtained by 
setting the imaginary part of $z$ equal to a constant, and the second by doing the 
same with the real part. 


9. Consider now a surface that is both HI and isothermic. Formula (25) shows 
that the angle $\beta$ at which the lines $H = C$ cut the first lines of curvature is 
then a harmonic function, which is to be expected, since the lines $H = C$ are 
isothermal. It follows that, in this case, the primitive curves $\psi(u, v) = C$ are 
isothermal too, for they cut the first lines of curvature at an angle $\alpha$ which, 
by n° 6, equals $-\beta$, and, 













consequently, is harmonic. $\psi$ is therefore a function of a harmonic function: 
$\psi = g(U + V)$. 

This result is easily verified by calculation. Indeed, the isothermic form on an 
HI surface is, by (21), 
(28)    $s\omega_1 + r\omega_2 = -\dfrac{1}{2}\,\dfrac{\psi_{uv}}{\psi_u\psi_v}\,d\psi$. 
For the surface to be isothermic it is therefore necessary and sufficient that 
$\dfrac{\psi_{uv}}{\psi_u\psi_v}$, or $\dfrac{\Delta_2\psi}{\Delta_1\psi}$, be a function of $\psi$. Now this means 
precisely that the curves $\psi = C$ are isothermal, and consequently that $\psi$ is of 
the form $g(U + V)$. Hence, the necessary and sufficient condition for an HI 
surface to be isothermic is that the primitive function $\psi$ can be written 
$g(U + V)$. 

$U$ and $V$ are complex conjugates. One may regard $V$ as the complex variable 
attached to the isothermal net formed by the primitive curves and their orthogonal 
trajectories. 

The three complex variables $v$, $z$ and $V$ are three first integrals of $\omega_2$; they 
are therefore functions of one another. In particular, $V$ is a function of $x + iy$: 
        $V = P(x, y) + iQ(x, y)$ (and $U = P(x, y) - iQ(x, y)$), 
so that 
        $U + V = 2P$ and $\psi = g(2P)$. 

For an isothermic HI surface, the formulas (26) simplify. Let us first express the 
integrating factor $e^{\int(s\omega_1 + r\omega_2)}$. We have seen that 
        $s\omega_1 + r\omega_2 = -\dfrac{1}{2}\,\dfrac{\psi_{uv}}{\psi_u\psi_v}\,d\psi$. 
Now, since $\psi = g(U + V)$, we have, inversely, $U + V = G(\psi)$, and 
        $\dfrac{\psi_{uv}}{\psi_u\psi_v} = \dfrac{g''}{g'^2} = -\dfrac{G''(\psi)}{G'(\psi)}$; 
hence, up to a constant factor, 
        $e^{\int(s\omega_1 + r\omega_2)} = \sqrt{G'(\psi)}$, 
and consequently 
        $e^{\int(s\omega_1 + r\omega_2)}\,\omega_2 = \sqrt{G'(\psi)}\cdot\sqrt{\psi_v}\,dv = \sqrt{G_v}\,dv = \sqrt{V'}\,dv$. 

The formulas (26) therefore become 
(29)    $dz_0 = \sqrt{U'}\,du$, $dz = \sqrt{V'}\,dv$. 
This last formula can be written 
        $dz^2 = dV\,dv$, 
which brings out the fact, stated in n° 6, that the lines of curvature bisect the 
lines $H = C$ and $\psi = C$. 










III. HI SURFACES THAT ARE ISOTHERMIC AND W 

10. If the surface is W, that is, if there exists a relation between $H$ and $A$, 
formula (13) shows that $r\omega_1 + s\omega_2$ is an exact differential, and conversely. 
From this follows the differential equation of W surfaces: expressing that 
$r\omega_1 + s\omega_2$ has zero exterior differential, one finds the equation of the 4th 
order 
        $s_1 - r_2 + r^2 - s^2 = 0$, 
which can also be written 
        … 
on W surfaces, $\chi$ and $\varphi_1$ are therefore not independent; this is what follows 
from the relation (n° 4) 
        $d\varphi_1 = [\varphi_1\chi]$. 
11. If a surface is both W and isothermic, the forms $r\omega_1 + s\omega_2$ and 
$s\omega_1 + r\omega_2$ are both exact differentials. Let us therefore set 
        $s\omega_1 + r\omega_2 = \dfrac{d\lambda}{\lambda}$, $r\omega_1 + s\omega_2 = \dfrac{d\mu}{\mu}$. 
By addition, then by subtraction, we get 
        $(s + r)(\omega_1 + \omega_2) = d\log(\lambda\mu)$, $(s - r)(\omega_1 - \omega_2) = d\log\dfrac{\lambda}{\mu}$. 
Now $\omega_1 + \omega_2$ is proportional to $\varpi_1$, itself proportional to $dx$, by (27). 
Hence $\lambda\mu$ depends only on $x$ and, likewise, $\dfrac{\lambda}{\mu}$ depends only on $y$. 
Consequently 
        $\lambda = X(x)\,Y(y)$, $\mu = \dfrac{X(x)}{Y(y)}$, 
whence this proposition: on every isothermic W surface, the isothermic form is the 
logarithmic differential of a function $X(x)\,Y(y)$, and the form $\varphi_1 = r\omega_1 + s\omega_2$ 
is the logarithmic differential of a function $X(x)/Y(y)$, $x$ and $y$ being the 
associated harmonic variables which remain constant along the lines of curvature. 


12. Let us turn to the case where the surface is at once HI, isothermic, and W. We have obtained the following result, which we believe to be new: the only surfaces that are at once HI, isothermic, and W are, besides surfaces of revolution, cylinders, and certain cones, those on which the primitive function $\psi$ is of the form $k \log (U + V)$, $U$ and $V$ being such that $U'V'/(U+V)^2$ depends only on $u + v$. These last surfaces, as we shall see later, are the Ossian Bonnet surfaces of the third class.










SURFACES A COURBURE MOYENNE ISOTHERME 15 


The condition that the HI surface be isothermic is expressed by $\psi = g(U + V)$ (n° 9); the condition that this surface be W, taking into account the expression of $A$ (formula 21), is expressed by

$$\frac{\psi_{uv}}{\sqrt{\psi_u\psi_v}} = p(u + v).$$

Combining these two conditions, we obtain

(30) $$\frac{g''(U+V)}{g'(U+V)}\,\sqrt{U'V'} = p(u+v).$$

This equation is satisfied by supposing that $U + V$ is a function of $u + v$, necessarily a linear one. Because of the indeterminacy that remains in the definition of $u$ and $v$ (n° 6), we may then simply take $U = u$, $V = v$. The quadratic forms of such a surface would be

$$ds^2 = 2\,\frac{g''(u+v)}{f'(u+v)}\,du\,dv,$$

$$\varphi = g'(u+v)\,du^2 + 2f(u+v)\,\frac{g''(u+v)}{f'(u+v)}\,du\,dv + g'(u+v)\,dv^2.$$

All the coefficients are functions of $u + v$. The corresponding surfaces therefore admit an infinity of displacements into themselves, by $u' = u + ia$, $v' = v - ia$, and an infinity of symmetries, by $u' = v + ib$, $v' = u - ib$. The lines $u + v = \text{const.}$, which slide along themselves and admit symmetries with respect to planes, can only be straight lines or circles. If they are circles, we have surfaces of revolution; this is the general case. If they are straight lines, that is, if $g'' = Cf'$, we have cylinders. We shall not dwell further on this simple case.


13. To set aside the preceding solution, suppose that $U + V = \tau$ and $u + v = t$ are independent functions. We must determine $g(\tau)$ and the functions $U(u)$ and $V(v)$ so that (30) is satisfied, but the Gauss equation (23) must also hold; calling $K$ the total curvature, it reads

(31) $$\frac{\partial^2}{\partial u\,\partial v}\log|f'| = \frac{\partial^2}{\partial u\,\partial v}\log|\psi_{uv}| + \frac{\psi_{uv}}{f'}\,K.$$

The total curvature $K$ is here a function of $t$. Moreover, $\psi = g(\tau)$ gives $\psi_{uv} = g''U'V'$, so that

$$\frac{\partial^2}{\partial u\,\partial v}\log|\psi_{uv}| = \frac{\partial^2}{\partial u\,\partial v}\log|g''| = (\log|g''|)''\,U'V'$$

or, taking (30) into account,

$$\frac{\partial^2}{\partial u\,\partial v}\log|\psi_{uv}| = p^2(t)\,\frac{g'^2}{g''^2}\,(\log|g''|)''.$$

Equation (31) therefore becomes, using (30) once more to express $\psi_{uv}$,

(32) $$\frac{d^2}{dt^2}\log|f'| = p^2\left[\frac{g'^2}{g''^2}\,(\log|g''|)'' + \frac{g'^2}{g''}\cdot\frac{K}{f'}\right].$$













Let us differentiate with respect to $\tau$ which, by hypothesis, is independent of $t$:

(33) $$0 = \left[\frac{g'^2}{g''^2}\,(\log|g''|)''\right]' + \frac{K(t)}{f'(t)}\left(\frac{g'^2}{g''}\right)'.$$

We have divided by $p^2$, which, by (30), cannot vanish unless $g''$ does, that is, unless $\psi$ is harmonic, which is excluded.





A. Equation (33) is satisfied if $\left(\frac{g'^2}{g''}\right)' = 0$; this gives, indeed, $\frac{g''}{g'^2} = m$, whence, as an easy calculation shows, $\frac{g'^2}{g''^2}\,(\log|g''|)'' = 2$. This solution can also be written

$$\frac{\psi_{uv}}{\psi_u\psi_v} = m:$$

it is a condition which, as we shall see later, characterizes the Ossian Bonnet surfaces. Integrating it, we find

$$\psi = -\frac{1}{m}\log(U + V).$$

Equation (30) then gives

$$\frac{\sqrt{U'V'}}{U+V} = -p(u+v),$$

and we shall show (n° 24) that this condition determines $U$, $V$, and $p$. The Gauss equation (32) becomes

$$\frac{d^2}{dt^2}\log|f'| = 2p^2 + \frac{1}{m}\left(\frac{p^2f^2}{f'} - f'\right).$$

When $\psi$ (that is, $U$ and $V$) has been determined, and $p$ accordingly, $f$ is only required to satisfy this third-order differential equation: there is thus a triple infinity of essentially different surfaces corresponding to the same primitive function $\psi$.


14. B. Let us seek to satisfy (33) in another way, by supposing $\left(\frac{g'^2}{g''}\right)' \neq 0$; it can then be written

(34) $$\frac{\left[\dfrac{g'^2}{g''^2}\,(\log|g''|)''\right]'}{\left(\dfrac{g'^2}{g''}\right)'} = -\frac{K(t)}{f'(t)}.$$

The common value of these ratios can only be a constant, say $a$, whence, integrating the part concerning $g$,

(35) $$\frac{g'^2}{g''^2}\,(\log|g''|)'' = a\,\frac{g'^2}{g''} + b.$$


But from equation (30) one can deduce another differential equation that $g$ must satisfy. Let us take the logarithm of both sides and differentiate first with respect to $u$, then with respect to $v$; we get

$$0 = (\log p)'' + \left(\log\frac{g'}{g''}\right)''U'V',$$

or, replacing $U'V'$ by means of (30) and separating the variables,

(36) $$\frac{g'^2}{g''^2}\left(\log\frac{g''}{g'}\right)'' = \frac{(\log p)''}{p^2}.$$

As before, the common value of these ratios is a constant, $c$, whence, for $g$,

(37) $$\left(\log\frac{g''}{g'}\right)'' = c\,\frac{g''^2}{g'^2}.$$

We must look for the solutions common to (35) and (37). Eliminating $(\log|g''|)''$ between these two equations, we get

(38) $$(\log|g'|)'' = a\,g'' + (b - c)\,\frac{g''^2}{g'^2},$$

which can be written

$$\frac{g'''}{g'} = a\,g'' + (b - c + 1)\,\frac{g''^2}{g'^2},$$

or again

$$(\log|g''|)' = a\,g' + (b - c + 1)\,(\log|g'|)'.$$

Let us differentiate, taking (35) and (38) into account:

$$a\,g'' + b\,\frac{g''^2}{g'^2} = a\,g'' + (b - c + 1)\left[a\,g'' + (b - c)\,\frac{g''^2}{g'^2}\right]$$

and, dividing by $g''$, which is not zero since $\psi$ is not harmonic,

(39) $$\left[c - (b - c)^2\right]\frac{g''}{g'^2} = a\,(b - c + 1).$$

Now our present hypothesis is that $\frac{g''}{g'^2}$ is not a constant; (39) must therefore vanish identically, that is, we must have

$$c = (b - c)^2 \quad\text{and}\quad a\,(b - c + 1) = 0,$$

whence two possible hypotheses:

(a) $c = (b - c)^2$, $b - c + 1 = 0$,

(b) $c = (b - c)^2$, $a = 0$.
15. Hypothesis (a) is equivalent to $c = 1$, $b = 0$. Equation (35) then gives $(\log|g''|)'' = a\,g''$, whence $\log|g''| = a\,g + p\tau + q$. Equation (37) becomes

$$(\log|g''|)'' = (\log|g'|)'' + \frac{g''^2}{g'^2}.$$

Let us substitute the value found for $\log|g''|$ and expand the right-hand side:

$$a\,g'' = \frac{g'''}{g'}, \quad\text{whence}\quad \log|g''| = a\,g + q,$$

which is compatible with the earlier expression on setting $p = 0$. Thus $g$ can be determined so as to satisfy (35) and (37).

Let us now seek to determine $f(t)$. Equation (34) gives

$$K = -a f'.$$

Now $K = H^2 - A^2 = f^2 - \frac{f'^2}{p^2}$; the equation above therefore gives $p^2 = \frac{f'^2}{f^2 + a f'}$.

But, by (36), $(\log p)'' = p^2$, which, expressed in terms of $f$, becomes

(40) $$(\log f'^2)'' - \left[\log(f^2 + a f')\right]'' = \frac{2f'^2}{f^2 + a f'}.$$

Moreover, $f$ must satisfy equation (32), which, in view of the expression found for $\log|g''|$, reduces to

(41) $$\frac{d^2}{dt^2}\log|f'| = 0.$$

Equations (40) and (41) have no common solution, as is easily verified: hypothesis (a) therefore yields nothing.


16. Under hypothesis (b), we have

$$c = (b - c)^2, \qquad a = 0.$$

From $a = 0$ one deduces that, in (33), the first term of the right-hand side is zero (by (35)); the second must therefore vanish too, hence $K = 0$, and the surfaces will be developable. They will be developables without an edge of regression, since they must be isothermic; they will not be cylinders, for which $\tau$ and $t$ would not be independent (n° 12): they will therefore be cones. Equation (35) reduces to

$$\frac{g'^2}{g''^2}\,(\log|g''|)'' = b;$$

$c$ is computed from (37). Substituting these expressions of $b$ and $c$ in $c = (b - c)^2$, one finds, after simplification,

$$\frac{g'^2}{g''^2}\left(\log\frac{g''}{g'}\right)'' = \left[\frac{g'^2}{g''^2}\,(\log|g'|)''\right]^2,$$

whence $g' = (p\tau + q)^{\alpha}$ or, more simply, since $\tau$ is defined only up to a linear transformation,

$$g' = \tau^{\alpha} \qquad \left(b - c = -\frac{1}{\alpha},\; c = \frac{1}{\alpha^2}\right).$$

The value $\alpha = -1$ must be set aside, for it would entail $\frac{g''}{g'^2} = \text{const.}$, which is excluded.







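Hence $g = \frac{\tau^{\alpha+1}}{\alpha+1}$, that is, $\psi = \frac{1}{\alpha+1}\,(U + V)^{\alpha+1}$.

Let us see whether $f(t)$ can be determined. Since $K = 0$, we have $f^2 - \frac{f'^2}{p^2} = 0$, hence $p^2 = \frac{f'^2}{f^2}$. The Gauss equation (32) becomes

(42) $$\frac{d^2}{dt^2}\log|f'| = \frac{1-\alpha}{\alpha^2}\,\frac{f'^2}{f^2}$$

and (36) gives, since $c = \frac{1}{\alpha^2}$, $(\log p)'' = \frac{p^2}{\alpha^2}$ or, replacing $p$ by its value,

(43) $$\frac{d^2}{dt^2}\log\left|\frac{f'}{f}\right| = \frac{1}{\alpha^2}\,\frac{f'^2}{f^2}.$$

Subtracting (42) from (43), we obtain $\frac{d^2}{dt^2}\log|f| = -\frac{1}{\alpha}\,\frac{f'^2}{f^2}$, whence, by a suitable choice of the origin of $t$, $\frac{f'}{f} = \frac{\alpha}{t}$, which gives

$$f = k\,t^{\alpha} \qquad (k \text{ constant}).$$

It remains to determine the form of $U(u)$ and $V(v)$ from (30), which, given that $g' = \tau^{\alpha}$ and $p = \frac{\alpha}{t}$, reads

(44) $$\frac{U'V'}{(U+V)^2} = \frac{1}{(u+v)^2}.$$

This equation will be studied later (n° 24); here we merely state the result. The solution $U = u$, $V = v$ being unacceptable, since $U + V$ must be independent of $u + v$, we must take

$$U = \frac{1}{u}, \qquad V = \frac{1}{v},$$

whence

$$\psi = \frac{1}{\alpha+1}\left(\frac{1}{u} + \frac{1}{v}\right)^{\alpha+1}.$$

One may replace $u$ and $v$ by $hu$ and $hv$, provided $\psi$ is replaced by $\psi/h$, and one thus obtains

$$\psi = \frac{m}{\alpha+1}\left(\frac{1}{u} + \frac{1}{v}\right)^{\alpha+1}, \qquad f = t^{\alpha},$$

with $m$ a constant. The quadratic forms of the surface are

(45) $$ds^2 = 2m\,\frac{du\,dv}{(uv)^{\alpha+1}}, \qquad \varphi = -m\left(\frac{1}{u} + \frac{1}{v}\right)^{\alpha}\left(\frac{du}{u} - \frac{dv}{v}\right)^2.$$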















For reality we must have $m > 0$. Let us introduce real parameters $r$ and $\theta$ by

$$u = re^{i\theta}, \qquad v = re^{-i\theta} \qquad (r > 0);$$

the forms become

(45') $$ds^2 = 2m\,\frac{dr^2 + r^2 d\theta^2}{r^{2\alpha+2}}, \qquad \varphi = 4m\left(\frac{2\cos\theta}{r}\right)^{\alpha} d\theta^2.$$

Setting

$$Z = \frac{\sqrt{2m}}{\alpha}\,\frac{1}{u^{\alpha}}, \qquad \bar Z = \frac{\sqrt{2m}}{\alpha}\,\frac{1}{v^{\alpha}},$$

we map the surface onto a plane, for $ds^2$ becomes $dZ\,d\bar Z$ or, in polar coordinates $Z = Re^{i\Omega}$, $ds^2 = dR^2 + R^2 d\Omega^2$, with the relations

$$R = \frac{\sqrt{2m}}{|\alpha|}\,\frac{1}{r^{\alpha}}, \qquad \Omega = -\alpha\theta,$$

and the second form is written $\varphi = \frac{2\sqrt{2m}}{|\alpha|}\,R\left(2\cos\frac{\Omega}{\alpha}\right)^{\alpha} d\Omega^2$. The generators $\theta = C$ of the surface are represented on the plane by the half-lines issuing from the origin, which shows clearly that we are dealing with cones. The lines $\Omega = \pm\frac{\alpha\pi}{2}$ of the plane represent inflexional generators if $\alpha > 0$, cuspidal generators if $\alpha < 0$. We consider only a portion of the cone lying between two generators of this kind, a portion which, on the plane $(v)$, would be represented on the right half-plane; the whole cone is obtained from such a portion by symmetries with respect to planes or straight lines.

If the cone is cut by the unit sphere centred at its vertex, one finds a curve whose geodesic curvature, relative to the sphere, is identical with the normal curvature of the same curve relative to the cone, that is, with $\frac{2\sqrt{2m}}{|\alpha|}\left(2\cos\frac{\Omega}{\alpha}\right)^{\alpha}$. Since the element of arc of this curve is $ds = d\Omega$, its intrinsic equation is

(46) $$\frac{1}{\rho_g} = \frac{2\sqrt{2m}}{|\alpha|}\left(2\cos\frac{s}{\alpha}\right)^{\alpha}.$$

One indeed finds that, for $s = \pm\frac{\pi\alpha}{2}$, it has inflexions if $\alpha > 0$ and cusps if $\alpha < 0$.

We know that, on this cone, the lines $H = C$, $\psi = C$ are isothermal curves. In the mapping they become isothermal curves of the plane, which are easily determined. The curves $H = C$ have the equation $u + v = C$, or $r\cos\theta = C$, which becomes, in the coordinates $R$ and $\Omega$, $R^{-1/\alpha}\cos\frac{\Omega}{\alpha} = C$: these are the curves obtained by setting equal to a constant the real part of the analytic function $Z^{-1/\alpha}$. The equation of the curves $\psi = C$ is $\frac{1}{u} + \frac{1}{v} = C$, or $\frac{1}{r}\cos\theta = C$, that is, $R^{1/\alpha}\cos\frac{\Omega}{\alpha} = C$. On the plane, the curves $\psi = C$ are the inverses of the curves $H = C$ in an inversion with pole 0. This inversion preserves the straight lines through 0 and the circles centred at 0, which are the images of the lines of curvature: one verifies that the lines of curvature bisect the lines $H = C$ and $\psi = C$.


Comparing the equation of the curves $\psi = C$ with the expression of $\frac{1}{\rho_g}$ for the spherical directrix curve of the cone, one sees that, on these curves, $R$ varies proportionally to $\rho_g$, whence this proposition: if, starting from the vertex, one lays off on each generator of the cone a length equal to the radius of geodesic curvature of the spherical curve in which the cone meets the unit sphere, one obtains a primitive curve of the cone; the other primitive curves are homothetic to it, and the curves of equal mean curvature are the inverses of the preceding ones, the centre of homothety and the pole of inversion being the vertex of the cone.

One may remark, moreover, that on any cone, if one lays off from the vertex, on each generator, a length inverse to the radius of geodesic curvature of the spherical curve determined by the cone on the unit sphere, the curve obtained is a line of equal mean curvature. This provides a means of determining directly the cones with isothermal mean curvature and, consequently, of checking our calculations. The function $f(s)$ appearing in the intrinsic equation $\rho_g = f(s)$ of the spherical directrix curve of the cone must be such that, in the plane $Z = Re^{i\Omega}$, the curves $R f(\Omega) = C$ are isothermal; one indeed recovers, as an easy calculation shows, $\frac{1}{\rho_g} = k\left(\cos\frac{s}{\alpha}\right)^{\alpha}$.
Po a 


IV. W SURFACES APPLICABLE TO SURFACES OF REVOLUTION


17. The $ds^2$ of a surface applicable to a surface of revolution can be written $ds^2 = 2F(u + v)\,du\,dv$. The total curvature $K$ will therefore also be a function of $u + v$; we shall suppose that $K$ is not a constant.

Suppose further that the surface is W; the mean curvature will be a function of $K$, hence of $u + v$, which amounts to saying that the surface will have isothermal, or constant, mean curvature. Let us examine only the case where the mean curvature is variable, $H = f(u + v)$. The primitive function of the surface will then be such that $\psi_{uv} = Ff'$; hence $\psi_{uv}$ will depend only on $u + v$. Moreover, the asphericity $A$, whose expression is $\frac{f'\sqrt{\psi_u\psi_v}}{\psi_{uv}}$, also depends solely on $u + v$; hence the product $\psi_u\psi_v$ is a function of $u + v$. Whence this first result: if a W surface, with variable total curvature and variable mean curvature, is applicable to a surface of revolution, it is an HI surface on which $\psi_u\psi_v$ and $\psi_{uv}$ are functions of $u + v$, like $H$.













It is easy to find all the HI surfaces whose primitive function enjoys these two properties; it suffices to use the following identity, which we have already pointed out elsewhere:

(47) $$\psi_v\left(\frac{\psi_{uv}}{\psi_u\psi_v}\right)_u + \psi_u\left(\frac{\psi_{uv}}{\psi_u\psi_v}\right)_v = (\log\psi_u\psi_v)_{uv} - 2\,\frac{\psi_{uv}^2}{\psi_u\psi_v}.$$

18. Suppose that the two functions of $u + v$, $\psi_u\psi_v$ and $\psi_{uv}$, are linearly independent, in other words, that the ratio $\frac{\psi_{uv}}{\psi_u\psi_v}$ is not constant. Formula (47) then furnishes a linear relation between $\psi_u$ and $\psi_v$ whose coefficients depend only on $u + v$. This relation, together with the expression of $\psi_u\psi_v$ in terms of $u + v$, shows that $\psi_u$ and $\psi_v$ are separately functions of $u + v$. Since $\psi_{uv}$ likewise depends only on $u + v$, we necessarily have expressions such as

$$\psi_u = g(u + v) - ia, \qquad \psi_v = g(u + v) + ia.$$

One shows that these surfaces are, in general, helicoids or, if $a = 0$, surfaces of revolution. The particular case $\psi_{uv} = Cf'$ would correspond to a cylinder (n° 12), but it must not be retained, since we suppose the surface to have variable total curvature.

19. Suppose, on the contrary, that the ratio $\frac{\psi_{uv}}{\psi_u\psi_v}$ is constant; this is a condition already met (n° 13 A), which characterizes, as we shall see, the Ossian Bonnet surfaces. We thus obtain the theorem: if a W surface, with variable total and mean curvatures, is applicable to a surface of revolution, it is either a helicoid, or a surface of revolution, or an Ossian Bonnet surface.


V. OSSIAN BONNET SURFACES


20. The Ossian Bonnet surfaces are the surfaces capable of an infinity of deformations preserving the principal curvatures. We first establish their differential equations.

Let $S$ and $\bar S$ be two surfaces applicable to one another with preservation of the principal curvatures. Their respective line elements are $\frac{2\omega_1\omega_2}{A}$ and $\frac{2\bar\omega_1\bar\omega_2}{\bar A}$. Since $\bar A = A$ by hypothesis, the isometry requires

(48) $$\bar\omega_1\bar\omega_2 = \omega_1\omega_2.$$

The Codazzi formula (13) gives on the other hand, since $\bar H = H$,

(49) $$\bar r\bar\omega_1 + \bar s\bar\omega_2 = r\omega_1 + s\omega_2.$$

A surface is an O. Bonnet surface if these two equations in $\bar\omega_1$, $\bar\omega_2$ have an infinity of solutions. Equation (48) requires, taking reality into account, that

(50) $$\bar\omega_1 = e^{i\theta}\omega_1, \qquad \bar\omega_2 = e^{-i\theta}\omega_2,$$

where $\theta$ is the angle which, after the mapping, the first line of curvature of $\bar S$ makes with the first line of curvature of $S$. (49) then gives

(51) $$\bar r = re^{-i\theta}, \qquad \bar s = se^{i\theta}.$$

But differentiating (50) exteriorly, and noting that $[\bar\omega_1\bar\omega_2] = [\omega_1\omega_2]$, one obtains

$$\bar r = (r - i\theta_2)\,e^{i\theta}, \qquad \bar s = (s + i\theta_1)\,e^{-i\theta},$$

whence, eliminating $\bar r$ and $\bar s$ by means of (51),

$$i\theta_1 = s\,(e^{2i\theta} - 1), \qquad i\theta_2 = r\,(1 - e^{-2i\theta}),$$

and finally

(52) $$i\,d\theta = s\,(e^{2i\theta} - 1)\,\omega_1 + r\,(1 - e^{-2i\theta})\,\omega_2.$$

It only remains to express that this Pfaffian equation is completely integrable, which gives

(53) $$r_1 + rs = 0, \qquad s_2 + rs = 0;$$

such are the two fourth-order equations of the O. Bonnet surfaces.


21. O. Bonnet's theorem, according to which his surfaces are isothermic, can be read off formulas (53), for they give $r_1 = s_2$, which is the equation of isothermic surfaces (n° 8). They also give $s = -\frac{r_1}{r}$, $r = -\frac{s_2}{s}$. Our form $\chi$ (n° 4) can therefore be written

$$\chi = -\left(\frac{r_1}{r} + \frac{s_1}{s}\right)\omega_1 - \left(\frac{r_2}{r} + \frac{s_2}{s}\right)\omega_2 = -d\log rs.$$

It is an exact differential; hence (n° 4) the surfaces in question are HI surfaces. Referring to formulas (21), one sees that

$$\chi = d\log\frac{\sqrt{\psi_u\psi_v}}{\psi_{uv}} = -d\log rs \quad\text{and}\quad rs = \frac{1}{4}\,\frac{\psi_{uv}^2}{(\psi_u\psi_v)^{3/2}},$$

whence, eliminating $rs$,

(54) $$\frac{\psi_{uv}}{\psi_u\psi_v} = m.$$

Conversely, every HI surface whose primitive function has its two differential parameters proportional is an O. Bonnet surface. Indeed, from (54) one deduces $r = -\frac{m}{2}\sqrt{\psi_v}$, $s = -\frac{m}{2}\sqrt{\psi_u}$, whence

$$r_1 = -\frac{m^2}{4}\sqrt{\psi_u\psi_v} = s_2 = -rs.$$

In short, the necessary and sufficient condition for a surface to be an O. Bonnet surface is that it be an HI surface and that its primitive function have its two differential parameters proportional.








It should be noted that equations (53) are satisfied if $r = 0$, $s = 0$; (52) then gives $\theta = \text{const}$. The corresponding O. Bonnet surfaces have constant mean curvature: these are the O. Bonnet surfaces of the first class; we shall not concern ourselves with them.


22. Let us return to the property, possessed by every O. B. surface, of being isothermic. From (54) one obtains by integration

(55) $$\psi = -\frac{1}{m}\log(U + V),$$

where $U$ and $V$ are, for reality, complex conjugates. The isothermic form $s\omega_1 + r\omega_2$ can therefore be written $-\frac{\psi_{uv}}{2\psi_u\psi_v}\,d\psi = -\frac{m}{2}\,d\psi$, and the integrating factor common to $\omega_1$ and $\omega_2$, which is in general $e^{\int s\omega_1 + r\omega_2}$, becomes $e^{-m\psi/2} = \sqrt{U + V}$. We shall set $U = P - iQ$, $V = P + iQ$, and we shall restrict ourselves to a region of the surface where $P > 0$. $P = \frac{1}{2}e^{-m\psi}$ is a harmonic function on the surface, and we have

(56) $$s\omega_1 + r\omega_2 = \frac{1}{2}\,\frac{dP}{P}.$$
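The link between (54) and (55) can be checked numerically. The following Python sketch is only an illustration, not part of the original argument: it estimates $\psi_{uv}/(\psi_u\psi_v)$ by central differences for $\psi = -\frac{1}{m}\log(U+V)$ and compares the result with $m$. The sample functions `U` and `V`, the test point and the step `h` are arbitrary assumptions, and real arguments are used although $u$ and $v$ are complex conjugates in the text (the identity is purely algebraic).

```python
import math

def psi(u, v, m, U, V):
    """Primitive function psi = -(1/m) * log(U(u) + V(v)), as in formula (55)."""
    return -math.log(U(u) + V(v)) / m

def ratio(u, v, m, U, V, h=1e-4):
    """Central-difference estimate of psi_uv / (psi_u * psi_v)."""
    f = lambda a, b: psi(a, b, m, U, V)
    psi_u = (f(u + h, v) - f(u - h, v)) / (2 * h)
    psi_v = (f(u, v + h) - f(u, v - h)) / (2 * h)
    psi_uv = (f(u + h, v + h) - f(u + h, v - h)
              - f(u - h, v + h) + f(u - h, v - h)) / (4 * h * h)
    return psi_uv / (psi_u * psi_v)

# Hypothetical sample functions, chosen only so that U + V > 0 near the test point.
U = lambda u: u * u + 1.0
V = lambda v: math.exp(v)

r = ratio(0.7, 0.3, m=2.5, U=U, V=V)  # should be close to m = 2.5
```

Any other smooth choice of `U` and `V` with $U + V > 0$ should give the same ratio, which is the content of the statement that (55) integrates (54).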


Conversely, if there exists a harmonic function $P$ such that the isothermic form can be written in this way, we shall have

$$P_1 = 2sP, \qquad P_2 = 2rP, \qquad P_{12} = 2s_2P + 4rsP,$$

and finally

$$P_{12} - rP_1 = 2(s_2 + rs)P.$$

Now, since $P$ is harmonic, $P_{12} - rP_1 = 0$; hence, on such surfaces, $s_2 + rs = 0$; one would show likewise, by forming $P_{21} - sP_2$, that $r_1 + rs = 0$, so that the necessary and sufficient condition for a surface to be an O. Bonnet surface is that the isothermic form be the logarithmic half-differential of a harmonic function.


23. The surfaces of constant mean curvature are, as we have said, the O.B. surfaces of the first class. The other O.B. surfaces, with variable mean curvature, fall into two classes, as the study of the Gauss equation shows. Since here $\psi = -\frac{1}{m}\log(U + V)$, we have

$$\psi_{uv} = \frac{U'V'}{m\,(U+V)^2}, \qquad \frac{\partial^2}{\partial u\,\partial v}\log|\psi_{uv}| = 2m\,\psi_{uv}.$$

Equation (23) then becomes

(57) $$\frac{d^2}{dt^2}\log|f'| = \psi_{uv}\left(\frac{f^2}{f'} + 2m\right) - \frac{f'}{m}.$$

If we apply to it the operation $D = \frac{\partial}{\partial u} - \frac{\partial}{\partial v}$, we obtain

(58) $$0 = D\psi_{uv}\left(\frac{f^2}{f'} + 2m\right),$$

whence two possibilities:

either $\frac{f^2}{f'} + 2m = 0$, which gives $f = \frac{2m}{t + c}$. This function satisfies (57) without any condition being imposed on the functions $U$ and $V$: these are the O.B. surfaces of the second class. The expression of their coordinates can be completely determined in terms of $u$, $v$ and two arbitrary functions, but they are imaginary, as E. Cartan has shown;

or $D\psi_{uv} = 0$: $\psi_{uv}$ is a function of $u + v$ only: these are the O.B. surfaces of the third class, which we shall now study.
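The second-class solution can be checked with a few lines of Python. This is a sketch under stated assumptions: it takes the Gauss equation in the form $\frac{d^2}{dt^2}\log|f'| = \psi_{uv}\left(\frac{f^2}{f'} + 2m\right) - \frac{f'}{m}$ and the solution $f = 2m/(t + c)$, both read from a damaged source, and verifies that the bracket vanishes and that the remaining terms balance. The numerical values of `m`, `c` and `t` are arbitrary.

```python
import math

def f(t, m, c):
    """Candidate mean curvature f = 2m / (t + c)."""
    return 2 * m / (t + c)

def f_prime(t, m, c):
    """Exact derivative of f."""
    return -2 * m / (t + c) ** 2

def bracket(t, m, c):
    """The factor f^2 / f' + 2m multiplying psi_uv in the Gauss equation."""
    return f(t, m, c) ** 2 / f_prime(t, m, c) + 2 * m

def second_log_derivative(t, m, c, h=1e-3):
    """Central-difference estimate of d^2/dt^2 log|f'|."""
    g = lambda s: math.log(abs(f_prime(s, m, c)))
    return (g(t + h) - 2 * g(t) + g(t - h)) / (h * h)

m, c, t = 1.5, 0.4, 1.1
lhs = second_log_derivative(t, m, c)   # left-hand side of the Gauss equation
rhs = -f_prime(t, m, c) / m            # what is left of the right-hand side
```

Because the bracket is zero, $\psi_{uv}$ drops out of the check, which mirrors the remark that no condition is imposed on $U$ and $V$.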


24. The O. Bonnet surfaces of the third class enjoy several properties that are evident from our formulas. Since their line element is $2\,\frac{\psi_{uv}}{f'}\,du\,dv$, as for every HI surface, and $\psi_{uv}$ depends only on $u + v$, like $f'$, they are applicable to surfaces of revolution (n° 17). Furthermore, referring to the expression (21) of $A$, and taking into account that $\psi_u\psi_v = \frac{\psi_{uv}}{m}$, one sees that $A$ is a function of $u + v$ only, like $H$, so that $A$ is a function of $H$ and, consequently, these surfaces are W surfaces. These properties are, moreover, well known.

By formula (55), the primitive curves $\psi = C$ are isothermal curves. Together with their orthogonal trajectories, they form a net to which the complex variable $V$ is attached, in the sense that the curves $\psi = C$ and their orthogonal trajectories are obtained by setting equal to a constant either the real part or the imaginary part of $V$. We shall set $V = P + iQ$; this variable, which is attached to the primitive curves in the same sense as the variable $v = p + iq$ is attached to the lines of equal mean curvature, will be called the primitive variable. Since $V$ and $v$ are two first integrals of $\omega_2 = 0$, there is an analytic relation $V(v)$, which we shall call the primitive relation.

On these surfaces we have, by (55),

(59) $$\psi_{uv} = \frac{U'V'}{m\,(U+V)^2};$$

this must be a function of $u + v$, hence the logarithmic derivatives with respect to $u$ and with respect to $v$ are equal, which gives

(60) $$\frac{U''}{U'} - \frac{2U'}{U+V} = \frac{V''}{V'} - \frac{2V'}{U+V}$$

or, setting, in order to lower the order, $U' = \lambda(U)$, $V' = \mu(V)$,

(61) $$(\lambda' - \mu')(U + V) = 2(\lambda - \mu).$$

Differentiations with respect to $U$, then with respect to $V$, give

(62) $$\lambda''(U + V) = \mu''(U + V) = \lambda' + \mu'.$$

From this one deduces

$$\lambda'' = \mu'' = 2a \qquad (a, \text{ real constant}),$$

whence

$$\lambda' = 2aU + b_1, \qquad \mu' = 2aV + b_2 \qquad (b_1, b_2 \text{ complex conjugates}).$$

Substituting in (62), one obtains $b_1 + b_2 = 0$, hence

$$b_1 = 2ib, \qquad b_2 = -2ib \qquad (b, \text{ real constant})$$

and (61) then gives

$$\lambda - aU^2 - 2ibU = \mu - aV^2 + 2ibV,$$

whence

$$\lambda = aU^2 + 2ibU + c, \qquad \mu = aV^2 - 2ibV + c \qquad (c, \text{ real constant}).$$

The differential primitive relation is therefore

(63) $$dv = \frac{dV}{aV^2 - 2ibV + c}.$$
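As a sanity check on the integration above, the following Python sketch evaluates the residuals of (61) and (62) for the quadratic solutions $\lambda = aU^2 + 2ibU + c$ and $\mu = aV^2 - 2ibV + c$. It is an illustration only; the function names and the sample values of `a`, `b`, `c`, `U`, `V` are assumptions made for the example.

```python
def lam(U, a, b, c):
    """lambda(U) = a U^2 + 2 i b U + c, so that U' = lambda(U)."""
    return a * U * U + 2j * b * U + c

def mu(V, a, b, c):
    """mu(V) = a V^2 - 2 i b V + c, so that V' = mu(V)."""
    return a * V * V - 2j * b * V + c

def dlam(U, a, b):
    """Derivative of lambda with respect to U."""
    return 2 * a * U + 2j * b

def dmu(V, a, b):
    """Derivative of mu with respect to V."""
    return 2 * a * V - 2j * b

def residual_61(U, V, a, b, c):
    """(lambda' - mu')(U + V) - 2 (lambda - mu); zero when (61) holds."""
    return (dlam(U, a, b) - dmu(V, a, b)) * (U + V) - 2 * (lam(U, a, b, c) - mu(V, a, b, c))

def residual_62(U, V, a, b):
    """lambda''(U + V) - (lambda' + mu'), with lambda'' = mu'' = 2a; zero when (62) holds."""
    return 2 * a * (U + V) - (dlam(U, a, b) + dmu(V, a, b))

a, b, c = 0.7, -1.3, 2.0
V = 0.9 + 0.4j
U = V.conjugate()          # U and V are complex conjugates in the text
r61 = abs(residual_61(U, V, a, b, c))
r62 = abs(residual_62(U, V, a, b))
```

Both residuals vanish identically in `U` and `V`, so the check does not depend on the particular sample point.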


25. We shall distinguish three types of surfaces, according to the nature of the roots $V_1$, $V_2$ of the trinomial $aV^2 - 2ibV + c$, where $a$, $b$ and $c$ are real:

type A: $b^2 + ac > 0$, 2 distinct roots, pure imaginary;
type B: $b^2 + ac < 0$, 2 distinct roots, symmetric with respect to the imaginary axis;
type C: $b^2 + ac = 0$, 2 coincident roots on the imaginary axis.

Types A and C have two realizations: the normal one, $a \neq 0$, and the special one, $a = 0$. In type A special, one of the roots is thrown to infinity; in type C special, both roots are infinite.
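The classification can be made concrete with a small Python helper. This is a sketch, not the author's procedure: the function name `classify` and its return format are assumptions, and the exact test `disc == 0` is only meaningful for exactly representable coefficients such as small integers. The roots are computed as $V = i\,(b \pm \sqrt{b^2 + ac})/a$, which follows from the quadratic formula applied to $aV^2 - 2ibV + c$.

```python
import cmath

def classify(a, b, c):
    """Classify the trinomial a*V**2 - 2i*b*V + c (a, b, c real) by the sign of b**2 + a*c.

    Returns (type, realization, finite_roots).
    """
    disc = b * b + a * c
    kind = "A" if disc > 0 else ("B" if disc < 0 else "C")
    if a != 0:
        s = cmath.sqrt(disc)
        roots = [1j * (b + s) / a, 1j * (b - s) / a]
        return kind, "normal", roots
    # a == 0: at least one root has gone to infinity (types A and C only).
    roots = [c / (2j * b)] if b != 0 else []
    return kind, "special", roots

type_a = classify(1, 0, 1)     # V^2 + 1: roots +i and -i
type_b = classify(-1, 0, 1)    # 1 - V^2: roots +1 and -1
type_c = classify(0, 0, 1)     # constant trinomial: both roots at infinity
```

For type B the two roots satisfy $V_2 = -\bar V_1$, which is the symmetry with respect to the imaginary axis stated above.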

The differential primitive relation (63) can be simplified by using the fact that the complex variables $V$ and $v$ are not completely determined. As for $V$, it enters only through its real part, since only the sum $U + V = 2P$ appears in the formulas; one may add to $V$ a pure imaginary constant, provided the same quantity is subtracted from $U$. Moreover, if $U$ and $V$ are multiplied by the same positive constant, $\psi$ is merely increased by a constant, which changes nothing in the surface. As for $v$, it may be replaced by $av + b + ic$ ($a$, $b$, $c$ real), provided $u$ is replaced by $au + b - ic$, which replaces $u + v$ by $a(u + v) + 2b$. It must not be forgotten that, under this change, $\psi$ does not remain invariant (n° 6), but becomes $\frac{\psi}{a}$; in particular, if $u$ and $v$ are replaced by $-u$ and $-v$, $\psi$ becomes $-\psi$. One thus obtains the reduced forms:

A normal: $dv = \frac{dV}{1 + V^2}$, $V = \tan v$;
A special: $dv = \frac{dV}{iV}$, $V = -ie^{iv}$;
B: $dv = \frac{dV}{1 - V^2}$, $V = \tanh v$;
C normal: $dv = -\frac{dV}{V^2}$, $V = \frac{1}{v}$;
C special: $dv = dV$, $V = v$.

The integration has been carried out so that the imaginary axes correspond, as do the positive half-planes, in the complex planes $(v)$ and $(V)$.

The corresponding expressions of $\psi_{uv}$ are

(64) A: $\frac{1}{m\sin^2 t}$, B: $\frac{1}{m\sinh^2 t}$, C: $\frac{1}{mt^2}$,

and the Gauss equation (57), which $H = f(t)$ must satisfy, likewise takes three different forms, according to the type considered.
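The reduced forms and the expressions (64) can be cross-checked numerically. The Python sketch below is illustrative only and covers the forms A normal, B, C normal and C special (A special is left out). For each it checks two things: that $V(v)$ satisfies $dV/dv = \mu(V)$, by a central difference in the complex plane, and that $m\,\psi_{uv} = U'V'/(U+V)^2$ from (59) agrees with the corresponding entry of (64) when $u = \bar v$ and $t = u + v$. The dictionary layout, the step `h` and the test point are assumptions made for the example; it also uses the fact that, in these four reduced forms, $\lambda$ and $\mu$ are the same polynomial.

```python
import cmath

# name: (V as a function of v, mu with dV/dv = mu(V), expected m * psi_uv as a function of t)
REDUCED_FORMS = {
    "A normal":  (cmath.tan,        lambda V: 1 + V * V, lambda t: 1 / cmath.sin(t) ** 2),
    "B":         (cmath.tanh,       lambda V: 1 - V * V, lambda t: 1 / cmath.sinh(t) ** 2),
    "C normal":  (lambda v: 1 / v,  lambda V: -V * V,    lambda t: 1 / t ** 2),
    "C special": (lambda v: v,      lambda V: 1.0,       lambda t: 1 / t ** 2),
}

def residuals(name, v, h=1e-5):
    """Return (ODE residual, psi_uv residual) for one reduced form at the point v."""
    V_of, mu, expected = REDUCED_FORMS[name]
    u = v.conjugate()                 # u and v are complex conjugates
    U, V = V_of(u), V_of(v)           # U(u) is the conjugate function of V(v)
    ode = abs((V_of(v + h) - V_of(v - h)) / (2 * h) - mu(V))
    psi = abs(mu(U) * mu(V) / (U + V) ** 2 - expected(u + v))
    return ode, psi

worst = max(max(residuals(name, 0.6 + 0.2j)) for name in REDUCED_FORMS)
```

A small value of `worst` indicates that, at the sample point, each listed reduced form is consistent with both (63) and (64).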





26. Relation (63) shows that the lines of the $(V)$ plane along which $dv$ is real are circles (or straight lines) orthogonal to the imaginary axis. Hence, whatever the type considered, if the O.B. surface is mapped on the $(V)$ plane, the orthogonal trajectories of the lines $H = C$ have as image a pencil of circles (or straight lines) orthogonal to the imaginary axis; the curves $H = C$ themselves are represented by the pencil orthogonal to the preceding one.

The pencil of circles representing the orthogonal trajectories of the lines of equal mean curvature may have its limiting (Poncelet) points on the imaginary axis, distinct (type A), or its base points symmetric with respect to the imaginary axis (type B), or its base points coincident on the imaginary axis, the radical axis being perpendicular to the imaginary axis (type C). In type A normal, the curves $H = C$ are represented by arcs of circles ending at the points $V_1$, $V_2$; in type A special, they are represented by straight lines radiating from $V_1$, $V_2$ being at infinity. In type B, the curves $H = C$ have as image a pencil of circles having as limiting points $V_1$ and $V_2$, symmetric with respect to the imaginary axis. Finally, in type C, the curves $H = C$ are circles tangent at $V_1 = V_2$ to the imaginary axis or, if the type is special, straight lines parallel to the imaginary axis.


27. Let us now seek the map of the lines of curvature on the $(V)$ plane. We know (n° 6) that they bisect the primitive curves and the curves $H = C$. Now, on the $(V)$ plane, the lines $\psi = C$ are the parallels to the imaginary axis, and the lines $H = C$ form a pencil of circles, as we have just seen. Elementary considerations show that, consequently, the lines of curvature will be, on the map, confocal conics, the foci being the points $V_1$ and $V_2$. They will be central conics in type A normal and in type B; confocal parabolas, having the imaginary axis as axis, in type A special; in type C normal, they will be the half-lines issuing from the point $V_1 = V_2$ of the imaginary axis, and the circles centred at that point; in type C special, they will be parallels to the axes.










We have shown (n° 8) how a complex variable $z$ is attached to the net of lines of curvature of an isothermic surface. Here, since

$$e^{\int s\omega_1 + r\omega_2} = \sqrt{U + V}, \quad\text{and}\quad \omega_2 = \sqrt{\psi_v}\,dv = \sqrt{-\frac{V'}{m\,(U+V)}}\,dv,$$

we may take

(65) $$dz = \sqrt{V'}\,dv \quad\text{or}\quad dz = i\sqrt{V'}\,dv,$$

according as $m$ is negative or positive; in either case, $dz^2 = \pm\,dV\,dv$. If, in (65), we replace $dv$ by means of (63), we obtain

(66) $$dz = \frac{dV}{\sqrt{aV^2 - 2ibV + c}} \quad\text{or}\quad \frac{i\,dV}{\sqrt{aV^2 - 2ibV + c}},$$

which confirms that the lines of curvature, $y = C$, $x = C$, have as map, in the $(V)$ plane, confocal conics. Formulas (66) give in addition, in each case, by integration, the expression of $V$ in terms of $z$, that is, of $P$ and $Q$ in terms of $x$, $y$.

28. The form of the function $P(x, y)$ is remarkable: $P$ is the product of a function of $x$ by a function of $y$. This property, of which E. Cartan has made great use, follows from a more general proposition established earlier (n° 11). Here we have (n° 22)

$$s\omega_1 + r\omega_2 = \frac{1}{2}\,\frac{dP}{P},$$

hence, by virtue of the property recalled, $P = X(x)\,Y(y)$. Since $P$ is a harmonic function, the possible forms of $X(x)$ and $Y(y)$ are very limited.

We shall not develop further the theory of the Ossian Bonnet surfaces which, given its great interest, deserves a separate study. Our intention was only to present them here as a remarkable specimen of the surfaces with isothermal mean curvature.


Issy-les-Moulineaux (Seine) 








DISCRETE SPACE-TIME AND INTEGRAL LORENTZ 
TRANSFORMATIONS 


ALFRED SCHILD 


Introduction. Modern physical theory, both classical and quantal, faces 
serious difficulties which arise from the divergence of certain integrals. 
Perhaps the best known of these “‘infinities” is the self-energy of the point 
electron. Most of the simpler devices used to eliminate the infinities, such 
as the introduction of a finite electron radius, are non-relativistic and must 
therefore be rejected. Relativistic theories¹ which do avoid some or all of 
the infinities are very complicated and often suffer from difficulty in physical 
interpretation. 

The idea of introducing discreteness into space and time has occasionally 
been considered.² It seems likely that a physical theory based on a discrete 
space-time background will be free of the infinities which trouble contemporary 
quantum mechanics. The objection which is usually raised against such 
discrete schemes is that they are not invariant under the Lorentz group. 
The purpose of this investigation is to show that there is a simple model of 
discrete space-time which, although not invariant under all Lorentz transformations, 
does admit a surprisingly large number of Lorentz transformations. This group 
of transformations is, in fact, sufficiently large to make doubtful the validity 
of most physical objections raised against discrete space-times. 

Apart from the physical speculations in the introduction, this paper is of a 
purely mathematical nature. We consider all events in Minkowski space-time 
whose four coordinates t, x, y, z are integers. (The velocity of light is taken 
as unity.) These events form a “cubic lattice”³ in space-time. We first 
investigate the null lines which join lattice points, then the Lorentz transforma- 
tions which leave the cubic lattice as a whole invariant. We shall call these 
integral null lines and integral Lorentz transformations, respectively. We also 
consider the time-like lines through lattice points which are mapped into 
lines parallel to the t-axis by an integral Lorentz transformation. These 
lines will be called integral time lines. 
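To make the definition of integral null lines concrete, the following Python sketch lists the primitive integer directions (t, x, y, z) with t > 0 and t² = x² + y² + z², which are the possible directions of null lines joining lattice points. It is an illustration only, not the paper's method; the function name and the brute-force search are assumptions made for the example.

```python
from math import gcd, isqrt

def primitive_null_directions(max_t):
    """Integer vectors (t, x, y, z), 0 < t <= max_t, with t^2 = x^2 + y^2 + z^2 and gcd 1."""
    found = []
    for t in range(1, max_t + 1):
        for x in range(-t, t + 1):
            for y in range(-t, t + 1):
                z_squared = t * t - x * x - y * y
                if z_squared < 0:
                    continue
                z = isqrt(z_squared)
                if z * z != z_squared:
                    continue  # z would not be an integer
                for signed_z in {z, -z}:
                    # Keep only primitive directions (no common factor).
                    if gcd(gcd(t, abs(x)), gcd(abs(y), abs(signed_z))) == 1:
                        found.append((t, x, y, signed_z))
    return found

directions = primitive_null_directions(3)
```

Up to t = 3 the search returns the six axis directions such as (1, 1, 0, 0) and the twenty-four directions obtained from (3, 1, 2, 2) by permuting and changing the signs of the spatial components.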

It may be noted that our model of discrete space-time involves a fundamental 
length⁴ ε, namely, the least non-zero interval between lattice points. In the 
present investigation this fundamental distance has been chosen as the unit 
of length. In any physical theory based on our model, ε would probably be 
of the general order of magnitude of the classical electron radius (approximately 
10⁻¹³ cm.). 


Received January 26, 1948 

¹G. Wentzel, Rev. Mod. Phys., vol. 19 (1947), 1-18. 

²V. Ambarzumian and D. Iwanenko, Z. f. Phys., vol. 64 (1930), 563-567; L. Silberstein, 
“Discrete Space-Time,” University of Toronto Studies, Physics Series (1936). For a short 
account of the present paper, see Phys. Rev., vol. 73 (1948), 414-415. 

³“Hypercubic” would be the appropriate term; we shall retain the shorter form. 

⁴Cf. W. Heisenberg, Ann. Phys., vol. 32 (1938), 20- 




There are two attractive possibilities for making a first rough attempt at 
introducing physical theory on our discrete space-time background. The 
motion of a particle may be assumed to consist of a temporally ordered 
sequence of lattice points such that successive lattice points are joined by 
(a) integral null lines, or (b) integral time lines. In case (a), a particle always 
moves with an instantaneous velocity equal to the velocity of light, but it 
changes direction rapidly so that its average velocity can be quite low. This 
zigzag motion has a striking resemblance to some of the features of the Dirac 
electron.⁵ Case (b) is rather similar to (a). The main difference is that the 
instantaneous velocity of a particle may now be zero; however it is interesting 
to note that the non-zero velocities associated with integral time lines are 
all very high and exceed 0.86 times the velocity of light (Sec. 8). 

Two of the results which we obtain are particularly striking. The first 
states that the spatial projections of integral null lines are dense (Sec. 4). 
This means that particles, whose motion is of the type (a) above, can have 
instantaneous velocities in practically any direction of space. We shall also 
show that all integral null lines are equivalent in the sense that, given any 
two integral null lines, an integral Lorentz transformation can be found 
which maps one into the other (Sec. 7). 

The second result states that spatial projections of integral time lines are
dense (Sec. 8). This means that particles, whose motion is of the type (b)
above, can have instantaneous velocities in practically any direction of space.

It is obvious that the cubic lattice which we are considering is invariant 
under all translations which map one lattice point into another. In this 
sense our discrete model of space-time is homogeneous. The two results 
stated above show that our model possesses also a large measure of spatial 
isotropy. 

Of any physical theory based on our model of discrete space-time we require
invariance under integral Lorentz transformations. The integral Lorentz
transformations are independent of the fundamental length ε. Thus in the
limit when ε tends to zero we expect the resulting equations of the physical
theory to remain invariant under integral Lorentz transformations, although
the background is now continuous Minkowski space-time. If the limiting
equations are at all simple they are almost certain to be invariant under all
Lorentz transformations, since it is difficult to visualize equations in continuous
space-time which are invariant under as substantial a subgroup of Lorentz
transformations as that considered here without these equations being
completely Lorentz invariant. Thus it is reasonable to hope that equations
based on our discrete space-time model might be found which, in the limit
ε → 0, take the form of the equations of “continuous” relativistic physics,
e.g. Maxwell’s equations, Lorentz’s equations of motion, and Dirac’s equations
for the electron. These equations of “continuous” physics would be a valid
approximation for macroscopic phenomena and even for atomic and molecular


⁵E. Schrödinger, Sitz. Ber. Preuss. Akad. Wiss., vol. 24 (1930), 418-428.



















DISCRETE SPACE-TIME 31 


theory—but they would not be appropriate for the description of nuclear 
phenomena or the theory of elementary particles. 

It is clear that we have merely chosen the simplest discrete model of space- 
time. Other regular point lattices in space-time might be considered and 
perhaps found more useful. In most essentials, however, these lattices would 
behave much the same as the cubic lattice studied here. For example, the 
Lorentz transformations which leave any such lattice invariant would all be 
associated with high velocities. 


1. Gaussian Integers. In this section we collect some well-known 
definitions and theorems concerning Gaussian integers which will be used in 
the sequel. 

A Gaussian integer is a complex number a + ib whose real part a and
imaginary part b are both integers. A real Gaussian integer is an ordinary
integer. Gaussian integers can be added, subtracted and multiplied to yield 
other Gaussian integers; they form an integral domain. Here and in the 
following we shall refer to Gaussian integers simply as “integers.” Sometimes, 
when we are dealing with ordinary integers, we shall add the adjective “‘real,”’ 
but usually it will be clear from the context whether integers are real or 
complex (Gaussian). 

The complex conjugate of c = a + ib will be denoted by $\bar{c} = a - ib$; the
absolute value of c by $|c| = +(a^2 + b^2)^{1/2}$.

A unit is an integer which divides all integers. There are exactly four
units in the Gaussian integral domain: ±1 and ±i.

A prime p is an integer which is divisible only by the four units ±1, ±i,
and by ±p, ±ip. Two integers are relatively prime if their only common
factors are units. Similarly, a set of integers with units as their only common
factors will be called primitive; thus a primitive vector is a vector whose
components form a primitive set of integers.

One of the most important properties of the Gaussian integral domain is
that it admits of unique factorization into primes.⁶ By this is meant the
following: An integer a can be written in the form

(1.01) $a = p_1 p_2 \cdots p_r,$

where the $p_i$ are primes other than units; if it can also be written in the
form

(1.02) $a = q_1 q_2 \cdots q_s,$

where the $q_i$ are primes other than units, then r = s, and, for a suitable
relabelling of the factors $q_1, \ldots, q_s$, we have

(1.03) $p_1 = u_1 q_1,\ p_2 = u_2 q_2,\ \ldots,\ p_r = u_r q_r,$

where $u_1, u_2, \ldots, u_r$ are units.


We shall apply the term real prime to a prime, as defined above, which is 
real. This definition does not agree with the usual one for real integers in 


⁶G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers (Oxford, 1938),
184, Theorem 215.










which the criterion is the absence of real non-trivial factors; thus 2 =
(1 + i)(1 − i) and 5 = (2 + i)(2 − i) are not real primes as we have defined
the term. It is clear that any real integer p can be written in the form

(1.04) $p = q a \bar{a},$

where a, $\bar{a}$ are complex conjugate integers, and where q is the product of those
real primes, each taken once, which are factors of p an odd number of times.
This q is the real integer of least magnitude for which a decomposition of p
in the form (1.04) is possible; apart from sign, q is uniquely determined by
p. Since 2 is not a real prime, q must be odd.

Although we shall not require it in the sequel, we add the well-known
theorem⁷ that among the numbers 2, 3, 5, 7, ... (which are usually called
primes) those and only those of the form 4n + 3, n being a real integer,
are real primes.
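
As an illustration of the decomposition (1.04), the following is a minimal Python sketch. It assumes p is a positive ordinary integer and uses the theorem just quoted to recognise real primes as the ordinary primes of the form 4n + 3. The function names `real_prime_part` and `norm_decomposition` are hypothetical, and trial division and a brute-force search are used only for brevity.

```python
from math import isqrt

def real_prime_part(p):
    """Product q of the ordinary primes of the form 4n + 3 that divide
    the positive integer p an odd number of times."""
    n, q, d = p, 1, 2
    while d * d <= n:
        exponent = 0
        while n % d == 0:
            n //= d
            exponent += 1
        if d % 4 == 3 and exponent % 2 == 1:
            q *= d
        d += 1
    if n > 1 and n % 4 == 3:      # leftover prime factor, exponent 1
        q *= n
    return q

def norm_decomposition(p):
    """Return (q, (u, v)) with p = q * (u*u + v*v), i.e. p = q a a-bar
    for the Gaussian integer a = u + iv."""
    q = real_prime_part(p)
    n = p // q
    for u in range(isqrt(n) + 1):
        v = isqrt(n - u * u)
        if u * u + v * v == n:
            return q, (u, v)
    raise ValueError("no decomposition found")
```

For instance, `norm_decomposition(6)` gives q = 3 and a = 1 + i, in agreement with 6 = 3(1 + i)(1 − i).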

An integer a + ib, where a and b are real, will be called even if a and b are
either both even or both odd in the conventional sense; a + ib will be called 
odd if one of a, b is odd and the other even. It is immediately obvious that 
for real integers our definitions of the terms even and odd coincide with the 
conventional meaning. The following facts are easily proved: 

An even integer is divisible by the prime 1 + i, an odd integer is not.
(Note that the primes 1 − i, −1 + i, −1 − i differ from 1 + i only by the
unit factors −i, i, −1, respectively.) The sum of two integers is even if the
integers are both even or both odd; otherwise the sum is odd. The product
of two integers is odd only if both factors are odd; otherwise it is even. These
rules are easy to remember as they are all familiar from the conventional
properties of even and odd real integers; in the case of Gaussian integers the
conventional role of 2 is taken over by the prime 1 + i which is a repeated
factor of 2:

(1.05) $2 = -i(1 + i)^2.$
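
The parity rules above can be checked mechanically on a small sample. The following Python sketch is only an illustration under stated assumptions: Gaussian integers are stored as (real, imaginary) pairs of ordinary integers, and the helper names `gadd`, `gmul`, `is_even` and `divisible_by_one_plus_i` are hypothetical.

```python
def gadd(u, v):
    """Sum of two Gaussian integers stored as (real, imaginary) pairs."""
    return (u[0] + v[0], u[1] + v[1])

def gmul(u, v):
    """Product of two Gaussian integers stored as (real, imaginary) pairs."""
    (a, b), (c, d) = u, v
    return (a * c - b * d, a * d + b * c)

def is_even(u):
    """Even in the sense of the text: both parts even or both odd."""
    return (u[0] - u[1]) % 2 == 0

def divisible_by_one_plus_i(u):
    """u / (1 + i) equals u (1 - i) / 2, which is a Gaussian integer
    exactly when both parts of u (1 - i) are even."""
    re, im = gmul(u, (1, -1))
    return re % 2 == 0 and im % 2 == 0

sample = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]

# Even integers are exactly those divisible by 1 + i.
parity_is_divisibility = all(
    is_even(u) == divisible_by_one_plus_i(u) for u in sample)
# A sum is even exactly when the summands have equal parity.
sum_rule = all(
    is_even(gadd(u, v)) == (is_even(u) == is_even(v))
    for u in sample for v in sample)
# A product is even exactly when at least one factor is even.
product_rule = all(
    is_even(gmul(u, v)) == (is_even(u) or is_even(v))
    for u in sample for v in sample)

# Right-hand side of (1.05): -i (1 + i)^2
two = gmul((0, -1), gmul((1, 1), (1, 1)))
```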

We require some further theorems of which the first is a standard result: 

If a and b are relatively prime, then there exist integers l and m such that
(1.06) $la - mb = 1.$

Conversely (1.06) implies that a and b are relatively prime.

Equation (1.06) may be regarded as a diophantine equation for the unknown
integers l and m. If l, m is a particular solution, then the general solution is
l + pb, m + pa, where p is an arbitrary integer. Thus the general solution
of (1.06), if relatively prime integers a and b are assigned, involves one discrete
complex parameter p or two discrete real parameters. If (1.06) is now regarded
as a diophantine equation for the four unknown integers a, b, l, m, then there
is a discrete sixfold infinity of solutions, since the complex integers a and b
can be chosen arbitrarily except for the restriction that they be relatively prime.
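
The one-parameter family of solutions of (1.06) can be verified on an example. In the following minimal Python sketch the values a = 2 + i, b = 1 + i and the particular solution l = m = 1 are chosen purely for illustration (they are not taken from the text), and the helper names are hypothetical.

```python
def gadd(u, v):
    return (u[0] + v[0], u[1] + v[1])

def gsub(u, v):
    return (u[0] - v[0], u[1] - v[1])

def gmul(u, v):
    (a, b), (c, d) = u, v
    return (a * c - b * d, a * d + b * c)

# Illustrative data: a = 2 + i and b = 1 + i are relatively prime, since a - b = 1.
a, b = (2, 1), (1, 1)
l0, m0 = (1, 0), (1, 0)          # particular solution: 1*a - 1*b = 1

def solution(p):
    """General solution (l0 + p b, m0 + p a) for a Gaussian integer p."""
    return gadd(l0, gmul(p, b)), gadd(m0, gmul(p, a))

def residual(l, m):
    """The value of l a - m b, which should be 1 = (1, 0)."""
    return gsub(gmul(l, a), gmul(m, b))

parameters = [(r, s) for r in range(-2, 3) for s in range(-2, 3)]
all_solve = all(residual(*solution(p)) == (1, 0) for p in parameters)
```

With p = i the sketch gives l = i, m = 2i, and indeed i(2 + i) − 2i(1 + i) = 1.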

In (1.06), a and b are either both odd or else one of them is odd and the other
even. If a and b are both odd, then one of l, m must be odd and the other
even, so that a + b + l + m is odd.





⁷Hardy and Wright, Theory of Numbers, 219, Theorem 252.







In the other case let us, for the sake of definiteness, take a even and b
odd. Then one of two possibilities can arise: (i) l and m are both odd, so
that a + b + l + m is odd; (ii) l is even and m is odd, so that a + b + l + m
is even. Given a solution of (1.06) in which a is even and b, l, m are odd, then
(1.07) $(l + b)a - (m + a)b = 1,$
and a, (l + b) are even, b, (m + a) are odd. We easily deduce the results:

If two relatively prime integers a and b are assigned, a being even and b
odd, then there exists an even integer l and an odd integer m, satisfying (1.06).

Equation (1.06) has a discrete sixfold infinity of solutions in integers,
such that a + b + l + m is even.


2. Spinors and Tensors. We give here a short survey of the spinor 
calculus⁸ in the form in which it will be applied to our problem.

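In a complex plane (i.e. a plane with two complex coordinates), called the
spin space, vectors and tensors are defined by their usual transformation
properties. Thus
(2.01) $c'^{\alpha} = \lambda_{\beta}{}^{\alpha} c^{\beta},$
where the $\lambda_{\beta}{}^{\alpha}$ are constants, is the transformation equation of a contravariant
spinvector $c^{\alpha}$. Greek suffixes range over 1, 2 and the usual range convention
and summation convention for repeated suffixes are assumed. The components
of $c^{\alpha}$ are complex and we denote their complex conjugates by $c^{\dot\alpha}$. Then,
obviously,

(2.02) $c'^{\dot\alpha} = \lambda_{\dot\beta}{}^{\dot\alpha} c^{\dot\beta},$
where $\lambda_{\dot\beta}{}^{\dot\alpha}$ denotes the complex conjugate of $\lambda_{\beta}{}^{\alpha}$. Expressions such as
$a^{\alpha\beta}$, $a^{\dot\alpha\beta}$, $a^{\alpha\dot\beta\gamma}$, etc., are called contravariant spintensors or spinors if they have,
respectively, the same transformation equations as $c^{\alpha}c^{\beta}$, $c^{\dot\alpha}c^{\beta}$, $c^{\alpha}c^{\dot\beta}c^{\gamma}$, etc. If,
in a spinor, dots are placed on undotted suffixes and the dots removed from
dotted suffixes, then the resulting spinor denotes the complex conjugate of
the original spinor; thus

(2.03) $\overline{a^{\dot\alpha\beta}} = a^{\alpha\dot\beta}$, $\overline{a^{\alpha\dot\beta\gamma}} = a^{\dot\alpha\beta\dot\gamma}$, etc.
A spintensor $a^{\dot\alpha\beta}$ which has the symmetry property
(2.04) $a^{\alpha\dot\beta} = a^{\dot\beta\alpha},$
or, equivalently,
(2.05) $\overline{a^{\dot 1 1}} = a^{\dot 1 1}$, $\overline{a^{\dot 1 2}} = a^{\dot 2 1}$, $\overline{a^{\dot 2 2}} = a^{\dot 2 2}$,

is said to be Hermitian.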
Let us now consider Minkowski space-time which is a flat real 4-space with 
coordinates 
(2.06) $(t, x, y, z) = (x^0, x^1, x^2, x^3),$
and with metric tensor 





⁸O. Laporte and G. E. Uhlenbeck, Phys. Rev., vol. 37 (1931), 1381; L. Infeld, Phys.
Zeitschrift, vol. 33 (1932), 475. For an early use of a similar technique see also E. Goursat, Ann.
École Norm. (3), vol. 6 (1889), 20, § 5.
















(2.07) $g_{00} = 1$, $g_{11} = g_{22} = g_{33} = -1$, $g_{rs} = 0$ for $r \neq s$.

Latin suffixes range over 0, 1, 2, 3 and the range and summation conventions
are assumed. The Lorentz transformations are the linear transformations of
the coordinates $x^r$ which leave the components of the metric tensor $g_{rs}$
invariant and which do not interchange past and future. We shall henceforth
consider only transformations which leave the origin $x^r = 0$ fixed, i.e.
homogeneous linear transformations. Then

(2.08) $x'^r = L_s{}^r x^s$
is a Lorentz transformation if
(2.09) $g_{mn} L_r{}^m L_s{}^n = g_{rs}$, $L_0{}^0 > 0$.


It immediately follows that the determinant of a Lorentz transformation is
+1 or −1. Lorentz transformations with determinant +1 are called proper.

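We associate a real 4-vector $A^r$ with a Hermitian spintensor $a^{\dot\alpha\beta}$ by the
relations:

(2.10)
$a^{\dot 1 1} = A^0 + A^3$,   $A^0 = \tfrac{1}{2}(a^{\dot 1 1} + a^{\dot 2 2})$,
$a^{\dot 1 2} = A^1 - iA^2$,  $A^1 = \tfrac{1}{2}(a^{\dot 1 2} + a^{\dot 2 1})$,
$a^{\dot 2 1} = A^1 + iA^2$,  $A^2 = \tfrac{1}{2} i (a^{\dot 1 2} - a^{\dot 2 1})$,
$a^{\dot 2 2} = A^0 - A^3$,   $A^3 = \tfrac{1}{2}(a^{\dot 1 1} - a^{\dot 2 2})$.

We then have
(2.11) $g_{mn} A^m A^n = a^{\dot 1 1} a^{\dot 2 2} - a^{\dot 1 2} a^{\dot 2 1} = \det(a^{\dot\alpha\beta}).$
From this identity it follows that spin-transformations $\lambda_{\beta}{}^{\alpha}$, which leave the
determinant of an arbitrary Hermitian spintensor $a^{\dot\alpha\beta}$ invariant, induce
transformations $L_s{}^r$ of Minkowski space-time which leave $g_{mn} A^m A^n$ invariant,
i.e. Lorentz transformations. Now

$\det(a'^{\dot\alpha\beta}) = \det(a^{\dot\gamma\delta} \lambda_{\dot\gamma}{}^{\dot\alpha} \lambda_{\delta}{}^{\beta}) = \det(a^{\dot\alpha\beta})\, |\det(\lambda_{\beta}{}^{\alpha})|^2.$
Thus we obtain the result: A spin-transformation $\lambda_{\beta}{}^{\alpha}$ induces a Lorentz
transformation if and only if the absolute value of its determinant is unity, i.e.
(2.12) $|\det(\lambda_{\beta}{}^{\alpha})| = 1.$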
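
The identity (2.11) and the scaling of the determinant by $|\det(\lambda_{\beta}{}^{\alpha})|^2$ can be checked numerically. The following Python sketch is an illustration only: it assumes that the list entry `lam[b][a]` stores $\lambda_{\beta}{}^{\alpha}$ and `a[al][be]` stores $a^{\dot\alpha\beta}$, and the vector A and the matrix `lam` are arbitrary example values, not taken from the text.

```python
def spintensor(A):
    """Hermitian array a[al][be] associated with A = (A0, A1, A2, A3) by (2.10)."""
    A0, A1, A2, A3 = A
    return [[A0 + A3, A1 - 1j * A2],
            [A1 + 1j * A2, A0 - A3]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def interval(A):
    """g_mn A^m A^n for the metric (2.07)."""
    return A[0] ** 2 - A[1] ** 2 - A[2] ** 2 - A[3] ** 2

def transform(a, lam):
    """a'[al][be] = sum over g, d of conj(lam[g][al]) * a[g][d] * lam[d][be]."""
    return [[sum(complex(lam[g][al]).conjugate() * a[g][d] * lam[d][be]
                 for g in range(2) for d in range(2))
             for be in range(2)]
            for al in range(2)]

A = (5, 1, -2, 3)                     # arbitrary example vector
lam = [[1 + 2j, 3], [1j, 2 - 1j]]     # arbitrary example, determinant 4
a = spintensor(A)
a_prime = transform(a, lam)
```

Here the interval of A is 11 and the determinant of the transformed spintensor is 16 times as large, so this particular `lam` does not satisfy (2.12) and does not induce a Lorentz transformation.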

It is easily seen that the two spin-transformations $\lambda_{\beta}{}^{\alpha}$ and $\lambda_{\beta}{}^{\alpha} e^{i\theta}$ ($\theta$ any real
number) induce the same Lorentz transformation. It follows that we may
limit the spin-transformations to those with determinant +1, without
reducing the set of Lorentz transformations which are induced by them.
It can also be shown⁹ that every proper Lorentz transformation can be
obtained from a spin-transformation. We summarize our conclusions as
follows:

A proper Lorentz transformation determines a spin-transformation, which
satisfies (2.12), uniquely to within an arbitrary phase factor $e^{i\theta}$.

Every spin-transformation $\lambda_{\beta}{}^{\alpha}$, satisfying
(2.13) $\det(\lambda_{\beta}{}^{\alpha}) = 1,$


induces a proper Lorentz transformation. To every proper Lorentz 







⁹O. Veblen and J. von Neumann, “Geometry of Complex Domains,” Institute for
Advanced Study mimeographed notes (Princeton, 1936).







transformation there correspond exactly two spin-transformations which 
satisfy (2.13) and which differ in sign only. 

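Spin-transformations which satisfy (2.13) leave invariant the components
of the real skew-symmetric spintensor $\epsilon^{\alpha\beta}$, defined by:

(2.14) $\epsilon^{12} = -\epsilon^{21} = 1$, $\epsilon^{11} = \epsilon^{22} = 0.$

This spintensor may be used to lower the suffixes of other spinors, and thus
to introduce covariant spinors $c_{\alpha}$, $a_{\dot\alpha\beta}$, etc. as follows:

(2.15) $c^{\alpha} = \epsilon^{\alpha\beta} c_{\beta},$
$c^1 = c_2$, $c^2 = -c_1$;
(2.16) $a^{\dot\alpha\beta} = \epsilon^{\dot\alpha\dot\gamma} \epsilon^{\beta\delta} a_{\dot\gamma\delta},$
$a^{\dot 1 1} = a_{\dot 2 2}$, $a^{\dot 1 2} = -a_{\dot 2 1}$, $a^{\dot 2 1} = -a_{\dot 1 2}$, $a^{\dot 2 2} = a_{\dot 1 1}.$
In particular, we find that
(2.17) $\epsilon_{11} = \epsilon_{22} = 0$, $\epsilon_{12} = -\epsilon_{21} = 1.$

If the vector $A^r$ is associated with the Hermitian spintensor $a^{\dot\alpha\beta}$ by the
relations (2.10), and the vector $B^r$ associated similarly with the spintensor
$b^{\dot\alpha\beta}$, we deduce easily that

(2.18) $A^m B_m = g_{mn} A^m B^n = \tfrac{1}{2} a^{\dot\alpha\beta} b_{\dot\alpha\beta}.$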


3. The Cubic Lattice, Integral Null Vectors, Integral Spinvectors. Consider 
the points in Minkowski space whose coordinates t, x, y, z are all real integers.
The set of these points will be called the cubic lattice. 

The coordinates of a point P of the lattice may be regarded as the components 
of the vector OP which joins the origin O to P. Such a vector will be called 
an integral vector since its components are integers. For most purposes it 
suffices to restrict ourselves to primitive integral vectors, whose components 
have no common factor, as all other integral vectors are multiples of these. 

As in the previous section, we can associate with an integral vector
t, x, y, z a Hermitian spintensor $a^{\dot\alpha\beta}$ by the relations

(3.01)
$a^{\dot 1 1} = t + z$,   $a^{\dot 1 2} = x - iy$,
$a^{\dot 2 1} = x + iy$,  $a^{\dot 2 2} = t - z$,
and

(3.02)
$t = \tfrac{1}{2}(a^{\dot 1 1} + a^{\dot 2 2})$,
$x = \tfrac{1}{2}(a^{\dot 1 2} + a^{\dot 2 1})$,
$y = \tfrac{1}{2} i (a^{\dot 1 2} - a^{\dot 2 1})$,
$z = \tfrac{1}{2}(a^{\dot 1 1} - a^{\dot 2 2})$.

It immediately follows from (3.01) that the components of $a^{\dot\alpha\beta}$ are Gaussian
integers. We also see from (3.01) that any common factor of t, x, y, z must
be a factor of all $a^{\dot\alpha\beta}$. Equation (3.02) shows that any factor common to the
$a^{\dot\alpha\beta}$, other than a factor of 2, must be a factor of t, x, y, z. In particular, if
the vector t, x, y, z is to be primitive, any common factor of $a^{\dot\alpha\beta}$ must be a
factor of 2.













We shall now study integral null vectors, whose components satisfy the
equation

(3.03) $t^2 - x^2 - y^2 - z^2 = 0.$
By (2.11), this implies
(3.04) $a^{\dot 1 1} a^{\dot 2 2} = a^{\dot 1 2} a^{\dot 2 1}.$

Making use of the unique factorization theorem for Gaussian integers,
$a^{\dot 1 1}$ must split into two factors, of which one is a factor of $a^{\dot 1 2}$, the other being
a factor of $a^{\dot 2 1}$. Since $a^{\dot 1 1}$ is real, those factors can be written in the form
$m c^{\dot 1}$ and $n c^1$, where m and n are real and relatively prime, and where $c^{\dot 1}$ is the
complex conjugate of $c^1$. Similarly, $a^{\dot 2 2}$ splits into factors $r c^{\dot 2}$ and $s c^2$, $r c^{\dot 2}$
being a factor of $a^{\dot 2 1}$ and $s c^2$ a factor of $a^{\dot 1 2}$, where r and s are real and
relatively prime, and where $c^{\dot 2}$ is the complex conjugate of $c^2$. Thus we have

$a^{\dot 1 1} = mn\, c^{\dot 1} c^1$,  $a^{\dot 1 2} = ms\, c^{\dot 1} c^2$,
$a^{\dot 2 1} = rn\, c^{\dot 2} c^1$,  $a^{\dot 2 2} = rs\, c^{\dot 2} c^2$.
Since $a^{\dot 1 2}$ is the complex conjugate of $a^{\dot 2 1}$, we have ms = rn. It follows that
m = r, n = s, or m = −r, n = −s. Then

$a^{\dot 1 1} = mn\, c^{\dot 1} c^1$,  $a^{\dot 1 2} = \pm mn\, c^{\dot 1} c^2$,
$a^{\dot 2 1} = \pm mn\, c^{\dot 2} c^1$,  $a^{\dot 2 2} = mn\, c^{\dot 2} c^2$.
The factor ±1 in the second and third of these expressions can be removed
by absorbing it in $c^1$ or in $c^2$. Doing this and writing p for mn, we have
(3.05) $a^{\dot\alpha\beta} = p\, c^{\dot\alpha} c^{\beta},$
where p is a real integer. Decomposing p in the form (1.04), i.e. $p = q a \bar{a}$,
we can absorb the complex integer a in both $c^1$ and $c^2$, thus reducing (3.05)
to the form

(3.06) $a^{\dot\alpha\beta} = q\, c^{\dot\alpha} c^{\beta},$

where, as is easily seen, q is the product of those real primes, each taken
once, which are contained an odd number of times in the greatest common
factor of t, x, y, z.

Let us now consider primitive integral null vectors. Since the square of a real
integer leaves a remainder of 1 or 0 on division by 4, according as the integer
is odd or even, it is easily seen from (3.03) that, of the components of a
primitive integral null vector, t and one of x, y, z must be odd, while the
two remaining components (two of x, y, z) must be even.
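
The parity pattern just stated can be confirmed by direct enumeration in a small box. The following Python sketch is an illustration only; the function name `primitive_null_vectors` and the bound 6 are arbitrary choices, and only vectors pointing into the future (t > 0) are listed.

```python
from math import gcd, isqrt

def primitive_null_vectors(bound):
    """All primitive integral null vectors (t, x, y, z) with t > 0
    and |x|, |y|, |z| <= bound, found by brute force."""
    found = []
    r = range(-bound, bound + 1)
    for x in r:
        for y in r:
            for z in r:
                s = x * x + y * y + z * z
                t = isqrt(s)
                if t > 0 and t * t == s and gcd(gcd(t, x), gcd(y, z)) == 1:
                    found.append((t, x, y, z))
    return found

vectors = primitive_null_vectors(6)

# t odd, and exactly one of x, y, z odd.
pattern_holds = all(
    t % 2 == 1 and sorted(c % 2 for c in (x, y, z)) == [0, 0, 1]
    for t, x, y, z in vectors)
```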

For a primitive integral null vector, q in (3.06) must be ±1, and we arrive
at the following result:

Each primitive integral null vector determines a spinvector $c^{\alpha}$ with integral
components $c^1$, $c^2$, such that
(3.07) $a^{\dot\alpha\beta} = \pm c^{\dot\alpha} c^{\beta},$
where the upper or lower sign must be taken throughout. Since $c^{\dot 1} c^1$ and
$c^{\dot 2} c^2$ are both positive, we see from (3.02) that t is positive or negative
according as the plus or minus sign is chosen in (3.07). For primitive integral
null vectors pointing into the future we have t > 0, and thus
(3.08) $a^{\dot\alpha\beta} = c^{\dot\alpha} c^{\beta}.$




Some non-primitive null vectors pointing into the future can also be
represented in the form (3.08). Whether this is possible or whether the
representation takes the more complicated form (3.06), with q > 1, depends
only on the properties of the greatest common factor of the components of
the null vector.

From (3.08) and (3.02) it is seen that a spinvector $c^{\alpha}$ with integral
components determines an integral null vector (t, x, y, z) if and only if
$c^1$, $c^2$ are both odd or both even. Such a spinvector will be called an integral
spinvector. Note that, even if $c^1$, $c^2$ are integers, $c^{\alpha}$ is not an integral spinvector
if $c^1 + c^2$ is odd.
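
The passage from a spinvector to a null vector by (3.08) and (3.02) can be sketched in a few lines of Python. This is an illustration under stated assumptions: Gaussian integers are stored as (real, imaginary) pairs, the function name `null_vector` is hypothetical, and `None` is returned when the spinvector is not integral in the sense just defined.

```python
def null_vector(c1, c2):
    """Null vector (t, x, y, z) determined by the spinvector (c1, c2),
    or None if c1 and c2 do not have the same parity."""
    n1 = c1[0] ** 2 + c1[1] ** 2          # conj(c1) * c1 = t + z
    n2 = c2[0] ** 2 + c2[1] ** 2          # conj(c2) * c2 = t - z
    if (n1 + n2) % 2 != 0:
        return None                        # one component odd, the other even
    # conj(c1) * c2 = x - i y
    re = c1[0] * c2[0] + c1[1] * c2[1]
    im = c1[0] * c2[1] - c1[1] * c2[0]
    return ((n1 + n2) // 2, re, -im, (n1 - n2) // 2)

def is_null(v):
    t, x, y, z = v
    return t * t - x * x - y * y - z * z == 0
```

For example, the spinvector (1, 1) gives the null vector (1, 1, 0, 0), while (1, 1 + i) has components of opposite parity and yields no integral null vector.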

The sum and difference of integral spinvectors are again integral spinvectors;
the product of an integer and an integral spinvector is an integral spinvector.
Thus the integral spinvectors form a two-dimensional complex vector space
with coefficients in the ring of complex integers. It is easy to see that the
independent integral spinvectors

(3.09) $e_{(1)} = (1 + i, 0)$, $e_{(2)} = (1, 1)$
form a basis; this means that any integral spinvector c can be written in
the form
(3.10) $c = a e_{(1)} + b e_{(2)},$
where a and b are integers, and that conversely any spinvector of this form
is integral.

The following theorem can be derived: 

The null vector associated, by (3.08), (3.02), with an integral spinvector
$c^{\alpha}$ is integral and primitive if and only if one or other of the following two
conditions is satisfied:

(3.11)
I $c^1$, $c^2$ are both odd and relatively prime.
II $c^{\alpha} = (1 + i) d^{\alpha}$,
where $d^1$, $d^2$ are relatively prime and one of them is even, the
other odd.

In the first case t is odd, z is even, and one of x, y is odd, the other even; in
the second case t, z are odd and x, y are even. By (3.08), $\pm c^{\alpha}$ and $\pm i c^{\alpha}$
determine the same null vector.

The criterion which we have just stated solves our basic problem of 
determining all primitive integral null vectors. 

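If we drop the requirement that $c^1$, $c^2$ be integers, we may enquire to what
extent the spinvector $c^{\alpha}$ is determined by a null vector in space-time. By
(3.01), a null vector determines a unique Hermitian spintensor $a^{\dot\alpha\beta}$. Let
(3.12) $a^{\dot\alpha\beta} = c'^{\dot\alpha} c'^{\beta} = c^{\dot\alpha} c^{\beta},$
or, equivalently,

$c'^{\dot 1} c'^1 = c^{\dot 1} c^1$, $c'^{\dot 2} c'^2 = c^{\dot 2} c^2$, $c'^{\dot 1} c'^2 = c^{\dot 1} c^2$, $c'^{\dot 2} c'^1 = c^{\dot 2} c^1.$

The first two of these equations show that $c'^1 = c^1 e^{i\theta}$, $c'^2 = c^2 e^{i\phi}$ ($\theta$, $\phi$ real),
and the last two equations imply $\theta = \phi$. Hence

(3.13) $c'^{\alpha} = c^{\alpha} e^{i\theta}.$

Thus a null vector t, x, y, z determines a spinvector $c^{\alpha}$ uniquely to within an
arbitrary phase factor $e^{i\theta}$.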


4. Integral Null Vectors are Spatially Dense. In (3.08) let us write
(4.01) $c^1 = (1 + i)(p + iq)$, $c^2 = (1 + i) r,$
where p, q, r are real integers. By (3.02) the spinvector $c^{\alpha}$ determines a null
vector with components
(4.02) $t = p^2 + q^2 + r^2,$
(4.03) $x = 2pr$, $y = 2qr$, $z = p^2 + q^2 - r^2.$
Equations (4.02), (4.03) determine a discrete three parameter set of integral
null vectors which are not necessarily primitive. We shall now show that
the spatial projections of these null vectors, i.e. the directions defined by
(4.03), are everywhere dense.

Consider an arbitrary direction $D_0$ in space, and let $l_0$, $m_0$, $n_0$ be its direction
cosines. We can then define real numbers $a_0$, $b_0$, $c_0$ by the equations

(4.04) $l_0 = 2 a_0 c_0$, $m_0 = 2 b_0 c_0$, $n_0 = a_0^2 + b_0^2 - c_0^2.$

We obtain, by virtue of $l_0^2 + m_0^2 + n_0^2 = 1$,

(4.05) $a_0 = l_0 [2(1 - n_0)]^{-1/2}$, $b_0 = m_0 [2(1 - n_0)]^{-1/2}$, $c_0 = [\tfrac{1}{2}(1 - n_0)]^{1/2}.$

It is obvious that $a_0$, $b_0$, $c_0$ can be approximated by rational fractions a, b, c
such that l, m, n, defined by¹⁰

(4.06) $l = 2ac$, $m = 2bc$, $n = a^2 + b^2 - c^2,$

are arbitrarily close to $l_0$, $m_0$, $n_0$, respectively. Thus the direction D, with
direction ratios l, m, n, makes an arbitrarily small angle with $D_0$. Let the
integer d be the least common denominator of the rational fractions a, b, c.
Then p, q, r, defined by

(4.07) $p = ad$, $q = bd$, $r = cd,$

are real integers. If we substitute these integers into (4.02) and (4.03) we
obtain an integral null line whose spatial component (x, y, z) is immediately
seen to have the direction D. Since D approximates $D_0$, our assertion is
proved.
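
The construction used in this proof can be followed numerically. The Python sketch below is an illustration under stated assumptions: the target direction is given by floating-point direction cosines with $n_0 \neq 1$, the rational approximations are obtained with `Fraction.limit_denominator` from the standard library (one possible choice; the text does not prescribe how a, b, c are found), and the function names are hypothetical.

```python
from fractions import Fraction
from math import acos, gcd, sqrt

def approximating_null_vector(direction, max_denominator):
    """Integral null vector (t, x, y, z) of the form (4.02), (4.03) whose
    spatial part approximates the given unit direction (l0, m0, n0)."""
    l0, m0, n0 = direction
    c0 = sqrt((1 - n0) / 2)                 # (4.05)
    a0 = l0 / (2 * c0)
    b0 = m0 / (2 * c0)
    a, b, c = (Fraction(v).limit_denominator(max_denominator)
               for v in (a0, b0, c0))
    d = 1                                    # least common denominator
    for f in (a, b, c):
        d = d * f.denominator // gcd(d, f.denominator)
    p, q, r = (int(f * d) for f in (a, b, c))        # (4.07)
    return (p * p + q * q + r * r, 2 * p * r, 2 * q * r, p * p + q * q - r * r)

def angle_to(direction, v):
    """Angle in radians between the spatial part of the null vector v
    and the unit direction; the spatial part of v has length t."""
    t, x, y, z = v
    cosine = (x * direction[0] + y * direction[1] + z * direction[2]) / t
    return acos(max(-1.0, min(1.0, cosine)))

s = 1 / sqrt(3)
target = (s, s, s)                           # example direction
v = approximating_null_vector(target, 1000)
```

Raising `max_denominator` allows closer approximations, at the price of larger components p, q, r.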

Having shown that a subset of all integral null vectors is spatially dense, 
it follows, a fortiori, that the same is true for the set of all integral null vectors. 
Since every integral null vector is codirectional with a primitive integral 
null vector, the set of all primitive integral null vectors is spatially dense.¹¹





¹⁰$l^2 + m^2 + n^2$ is not necessarily 1.

¹¹By saying that a set of vectors is spatially dense, we mean, more precisely, that the
directions of the spatial projections of the vectors in the set are dense. This remark applies
also to Sec. 8.




5. Integral Lorentz Transformations. A Lorentz transformation
(5.01) $x'^r = L_s{}^r x^s$
is integral if it maps into itself, i.e. leaves invariant as a whole, the cubic
lattice which consists of all points with integral coordinates $x^r$.

Consider the integral vector $x^s = \delta_m{}^s$, where m is 0, 1, 2, or 3, and where
$\delta_m{}^s$ is 1 if s = m and 0 if s ≠ m. The transformation (5.01) maps this vector into
$x'^r = L_m{}^r$. If (5.01) is an integral Lorentz transformation then $x'^r$ must
be an integral vector; thus the components $L_m{}^r$ must be real integers. It is
obvious that then $x'^r$ is always integral whenever $x^r$ is.

Since the determinant of a Lorentz transformation is ±1, the components
$(L^{-1})_s{}^r$ of the inverse Lorentz transformation will be real integers if $L_s{}^r$ are
real integers. Thus a Lorentz transformation with integral components $L_s{}^r$
maps the set of all integral vectors into the set of all integral vectors, and
not into a proper subset of the latter. The following conclusion is immediate:

A Lorentz transformation $L_s{}^r$ is integral if and only if all its components $L_s{}^r$
are real integers.

Thus, by (2.09), our problem of determining all integral Lorentz 
transformations reduces to the solution of 10 quadratic diophantine equations 
in 16 unknown integers. This rather formidable mathematical problem can 
be approached indirectly by considering integral null vectors, spinvectors 
and spin-transformations, as will be shown in this section and the next. 
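
The ten independent conditions contained in (2.09) are easy to test for a given integer matrix. In the following Python sketch the list entry `L[r][s]` is assumed to hold the coefficient of $x^s$ in $x'^r$, the function name `is_integral_lorentz` is hypothetical, and the matrix `example` is an illustrative integral Lorentz transformation chosen for this sketch (it is not quoted from the text).

```python
G = [1, -1, -1, -1]        # diagonal of the metric (2.07)

def is_integral_lorentz(L):
    """True if L has integer entries, satisfies the ten independent
    conditions g_mn L_r^m L_s^n = g_rs (r <= s), and has L_0^0 > 0."""
    if any(not isinstance(e, int) for row in L for e in row):
        return False
    for r in range(4):
        for s in range(r, 4):
            total = sum(G[m] * L[m][r] * L[m][s] for m in range(4))
            if total != (G[r] if r == s else 0):
                return False
    return L[0][0] > 0

identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Illustrative example: t' = 3t + 2x + 2z, x' = 2t + x + 2z, y' = y, z' = -2t - 2x - z.
example = [[3, 2, 0, 2],
           [2, 1, 0, 2],
           [0, 0, 1, 0],
           [-2, -2, 0, -1]]

# Integer entries alone are not enough: this matrix changes the interval.
counterexample = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
```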

We shall now prove the following theorem: A necessary and sufficient 
condition for a Lorentz transformation to be integral is that the Lorentz 
transformation, as well as its inverse, map primitive integral null vectors into 
integral null vectors. The necessity of the condition is trivial; we shall 
therefore consider only its sufficiency. 

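Consider the four independent primitive integral null vectors

(5.02)
$N_{(0)} = (1, -1, 0, 0),$
$N_{(1)} = (1, 1, 0, 0),$
$N_{(2)} = (1, 0, 1, 0),$
$N_{(3)} = (1, 0, 0, 1).$

We have

(5.03)
$E_{(0)} = (1, 0, 0, 0) = \tfrac{1}{2}(N_{(0)} + N_{(1)}),$
$E_{(1)} = (0, 1, 0, 0) = \tfrac{1}{2}(-N_{(0)} + N_{(1)}),$
$E_{(2)} = (0, 0, 1, 0) = N_{(2)} - \tfrac{1}{2}(N_{(0)} + N_{(1)}),$
$E_{(3)} = (0, 0, 0, 1) = N_{(3)} - \tfrac{1}{2}(N_{(0)} + N_{(1)}).$

A Lorentz transformation, satisfying the hypothesis of our theorem, maps
the vectors $N_{(r)}$ into integral null vectors

(5.04) $N'_{(r)} = (t_{(r)}, x_{(r)}, y_{(r)}, z_{(r)})$, r = 0, 1, 2, 3,
and it maps the vectors $E_{(r)}$ into

(5.05)
$E'_{(0)} = \tfrac{1}{2}(N'_{(0)} + N'_{(1)}),$
$E'_{(1)} = \tfrac{1}{2}(-N'_{(0)} + N'_{(1)}),$
$E'_{(2)} = N'_{(2)} - \tfrac{1}{2}(N'_{(0)} + N'_{(1)}),$
$E'_{(3)} = N'_{(3)} - \tfrac{1}{2}(N'_{(0)} + N'_{(1)}).$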










If $N'_{(r)} = M'_{(r)} d$, where d is an integer and $M'_{(r)}$ an integral null vector,
then, by hypothesis, the inverse of the Lorentz transformation considered
here maps $M'_{(r)}$ into an integral null vector $M_{(r)}$, and therefore maps $N'_{(r)}$
into $N_{(r)} = M_{(r)} d$. But, by (5.02), the vectors $N_{(r)}$ are all primitive. It
follows that d must be a unit. Hence the integral null vectors $N'_{(r)}$ must
be primitive. Then by Sec. 3, $t_{(r)}$ and one of $x_{(r)}$, $y_{(r)}$, $z_{(r)}$ must be odd,
the other two being even.

Since the scalar product of two vectors is an invariant, we have

$g_{mn} N'_{(0)}{}^m N'_{(1)}{}^n = g_{mn} N_{(0)}{}^m N_{(1)}{}^n,$
or
(5.06) $t_{(0)} t_{(1)} - x_{(0)} x_{(1)} - y_{(0)} y_{(1)} - z_{(0)} z_{(1)} = 2.$
Thus the left-hand side of this equation is even. Combining this fact with
the last statement of the preceding paragraph, we see that $t_{(0)} t_{(1)}$ and one
of the three products $x_{(0)} x_{(1)}$, $y_{(0)} y_{(1)}$, $z_{(0)} z_{(1)}$ must be odd. In order to be
definite, let us take $x_{(0)} x_{(1)}$ odd. Then $t_{(0)}$, $t_{(1)}$, $x_{(0)}$, $x_{(1)}$ are odd, and $y_{(0)}$,
$y_{(1)}$, $z_{(0)}$, $z_{(1)}$ are even. It follows that $t_{(1)} \pm t_{(0)}$, $x_{(1)} \pm x_{(0)}$, $y_{(1)} \pm y_{(0)}$,
$z_{(1)} \pm z_{(0)}$ are all even integers. Hence $E'_{(r)}$, defined in (5.05), are integral
vectors.

Applying the Lorentz transformation (5.01) to the vectors $E_{(r)}$, given by
(5.03), we obtain
(5.07) $E'_{(r)}{}^s = L_r{}^s.$

Thus $L_r{}^s$ are integers and the sufficiency of our condition is demonstrated.


6. Integral Spin-Transformations. It seems legitimate to deduce from 
the preceding theorem that a spin-transformation is associated with an 
integral Lorentz transformation if and only if both the spin-transformation 
and its inverse map integral spinvectors into integral spinvectors. It must 
be pointed out, though, that this statement is not a priori obvious and that 
we must proceed with caution. The reason is that a spinvector is not 
uniquely determined by a primitive null vector in space-time, but is 
determined only to within an arbitrary phase factor. However, we shall 
show now that the statement made above is true if the spin-transformation 
is taken with a suitable phase factor. 

If a spin-transformation and its inverse map integral spinvectors into 
integral spinvectors, then the corresponding Lorentz transformation is integral 
since it and its inverse map primitive integral null vectors into integral null 
vectors. It is therefore sufficient to show that, given an integral Lorentz 
transformation, a spin-transformation can be found which represents it and 
which, as well as its inverse, maps integral spinvectors into integral spinvectors. 

Let $L_s{}^r$ be an arbitrary integral Lorentz transformation. Since $L_s{}^r$ are
integers, it is clear, by (5.01), that the greatest common factor of the
components of an integral vector $x^r$ is a common factor of the components
of the transform $x'^r$ of $x^r$ under the integral Lorentz transformation. Since
$x^r$ can also be obtained from $x'^r$ by the integral Lorentz transformation
$(L^{-1})_s{}^r$, it follows that the $x^r$ and the $x'^r$ have the same greatest common
factor. In particular, we have that, if an integral null vector can be represented
in the form (3.08), the transform of this null vector under any integral
Lorentz transformation can again be so represented. The following is easily
deduced:

If $\lambda_{\beta}{}^{\alpha}$ is a spin-transformation which represents an integral Lorentz
transformation $L_s{}^r$ then $\lambda_{\beta}{}^{\alpha}$ maps any integral spinvector into a spinvector
which differs from an integral spinvector by at most a phase factor.

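Therefore, if we introduce the two spinvectors

(6.01) $e_{(1)} = (1 + i, 0)$, $e_{(2)} = (1, 1)$,
we must have
(6.02) $\lambda_{\beta}{}^{\alpha} e_{(1)}{}^{\beta} = e^{i\phi} e'_{(1)}{}^{\alpha}$, $\lambda_{\beta}{}^{\alpha} e_{(2)}{}^{\beta} = e^{-i\theta} e'_{(2)}{}^{\alpha}$,

where $e'_{(1)}$, $e'_{(2)}$ are integral spinvectors. Since $\lambda_{\beta}{}^{\alpha}$ is determined by $L_s{}^r$
only to within an arbitrary phase factor, we can choose this phase factor
so that, in (6.02), $\phi = 0$. We can then write:
(6.03) $(\lambda^{-1})_{\beta}{}^{\alpha} e'_{(1)}{}^{\beta} = e_{(1)}{}^{\alpha}$, $(\lambda^{-1})_{\beta}{}^{\alpha} e'_{(2)}{}^{\beta} = e^{i\theta} e_{(2)}{}^{\alpha}$,
where $(\lambda^{-1})_{\beta}{}^{\alpha}$ is the inverse spin-transformation which exists, by (2.12),
and which represents the integral Lorentz transformation $(L^{-1})_s{}^r$.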

From (6.03) we obtain, on addition,
(6.04) $(\lambda^{-1})_{\beta}{}^{\alpha} (e'_{(1)}{}^{\beta} + e'_{(2)}{}^{\beta}) = e_{(1)}{}^{\alpha} + e^{i\theta} e_{(2)}{}^{\alpha}.$
Since $e'_{(1)}{}^{\alpha} + e'_{(2)}{}^{\alpha}$ is integral, $e_{(1)}{}^{\alpha} + e^{i\theta} e_{(2)}{}^{\alpha}$ must be of the form $e^{i\psi}(p, q)$,
p and q being integers and $\psi$ real. Thus, by (6.01), we have
(6.05) $1 + i + e^{i\theta} = e^{i\psi} p,$
(6.06) $e^{i\theta} = e^{i\psi} q.$

From (6.06) it follows that |q| = 1 and hence that q and u = 1/q are units.
Then (6.05) can be written
(6.07) $1 + i = e^{i\theta}(up - 1).$
Taking absolute values, we find that $|up - 1| = 2^{1/2}$ and thus up − 1 must
be one of 1 + i, 1 − i, −1 + i, or −1 − i, since these are the only integers
of absolute value $2^{1/2}$. In any of these cases $e^{-i\theta}$ is a unit, by (6.07). Then
$e''_{(2)}{}^{\alpha} = e^{-i\theta} e'_{(2)}{}^{\alpha}$ is an integer and we can rewrite (6.03) as follows:
(6.08) $\lambda_{\beta}{}^{\alpha} e_{(1)}{}^{\beta} = e''_{(1)}{}^{\alpha}$, $\lambda_{\beta}{}^{\alpha} e_{(2)}{}^{\beta} = e''_{(2)}{}^{\alpha}$,
where $e''_{(1)}{}^{\alpha} = e'_{(1)}{}^{\alpha}$. Since an arbitrary integral spinvector can be written
in the form (3.10), we immediately see that $\lambda_{\beta}{}^{\alpha}$ maps integral spinvectors
into integral spinvectors. Similarly, corresponding to $(L^{-1})_s{}^r$, there must
exist a spin-transformation $\mu_{\beta}{}^{\alpha}$ which maps integral spinvectors into integral
spinvectors. But $\mu_{\beta}{}^{\alpha}$ can differ from $(\lambda^{-1})_{\beta}{}^{\alpha}$ by at most a phase factor $e^{i\chi}$ and,
since $(\lambda^{-1})_{\beta}{}^{\alpha}$ maps $e''_{(2)}$ into $e_{(2)}$, $\mu_{\beta}{}^{\alpha}$ maps $e''_{(2)}$ into an integral spinvector
$e^{i\chi} e_{(2)}$. It follows that $e^{i\chi}$ is a unit and that therefore $(\lambda^{-1})_{\beta}{}^{\alpha}$ maps integral
spinvectors into integral spinvectors. This establishes our assertion.

Let us denote by $\nu$ the determinant of $\lambda_{\beta}{}^{\alpha}$; we know, by (2.12), that $|\nu| = 1$.
Writing (6.08) in the form
(6.09) $\lambda_{\beta}{}^{\alpha} e_{(r)}{}^{\beta} = e''_{(r)}{}^{\alpha},$
and taking determinants on both sides, we have

(6.10) $\nu(1 + i) = \det(e''_{(r)}{}^{\alpha}),$

by (6.01). The right-hand side of (6.10) is obviously an integer. By the
same argument as that applied to (6.07), it follows that $\nu$ is a unit, i.e.
$\nu = 1$, $\nu = i$, or $\nu = -1$, $\nu = -i$. In the latter two cases we can absorb
the phase factor i in $\lambda_{\beta}{}^{\alpha}$, thus reducing these cases to the first two.

It is now clear that every proper integral Lorentz transformation is
represented by two spin-transformations, differing in sign only, which satisfy
the condition
(6.11) $\det(\lambda_{\beta}{}^{\alpha}) = 1$ or $i$,
and which are such that both the spin-transformation and its inverse map
the two integral spinvectors $e_{(1)}$ and $e_{(2)}$, given by (6.01), into integral
spinvectors. Conversely, the conditions just imposed on a spin-transformation
are sufficient to insure that the spin-transformation corresponds to an integral
Lorentz transformation.

Spin-transformations which satisfy the above conditions will be called 
integral spin-transformations. We shall now obtain the conditions on integral 
spin-transformations in a more explicit form. 

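If $\det(\lambda_{\beta}{}^{\alpha}) = 1$, (6.11), we have

(6.12) $(\lambda^{-1})_1{}^1 = \lambda_2{}^2$, $(\lambda^{-1})_2{}^1 = -\lambda_2{}^1$, $(\lambda^{-1})_1{}^2 = -\lambda_1{}^2$, $(\lambda^{-1})_2{}^2 = \lambda_1{}^1.$

On transforming $e_{(1)}$ and $e_{(2)}$ by $\lambda_{\beta}{}^{\alpha}$ and by $(\lambda^{-1})_{\beta}{}^{\alpha}$ we obtain the following
four spinvectors:

(6.13)
$((1 + i)\lambda_1{}^1,\ (1 + i)\lambda_1{}^2)$,  $(\lambda_1{}^1 + \lambda_2{}^1,\ \lambda_1{}^2 + \lambda_2{}^2)$,
$((1 + i)\lambda_2{}^2,\ -(1 + i)\lambda_1{}^2)$,  $(\lambda_2{}^2 - \lambda_2{}^1,\ -\lambda_1{}^2 + \lambda_1{}^1)$.
If $\det(\lambda_{\beta}{}^{\alpha}) = i$, (6.11), we obtain by the same procedure four spinvectors
which differ from those in (6.13) by unit factors only. Thus, in either case,
each of the spinvectors (6.13) must be integral, i.e. the two components must
be integers, both odd or both even. We easily deduce the following result: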


A spin-transformation $\lambda_\beta{}^\alpha$ is integral if and only if one of the following four
conditions is satisfied:

I $\lambda_\beta{}^\alpha$ are integers such that
(6.14) $\lambda_1^1 \lambda_2^2 - \lambda_2^1 \lambda_1^2 = 1$,
and such that $\lambda_1^1 + \lambda_2^1 + \lambda_1^2 + \lambda_2^2$ is even.

II
(6.15) $\lambda_\beta{}^\alpha = \mu_\beta{}^\alpha/(1 + i)$,
where $\mu_\beta{}^\alpha$ are odd integers such that
(6.16) $\mu_1^1 \mu_2^2 - \mu_2^1 \mu_1^2 = 2i$.

III $\lambda_\beta{}^\alpha$ are integers such that
(6.17) $\lambda_1^1 \lambda_2^2 - \lambda_2^1 \lambda_1^2 = i$,
and such that $\lambda_1^1 + \lambda_2^1 + \lambda_1^2 + \lambda_2^2$ is even.

IV $\lambda_\beta{}^\alpha = \mu_\beta{}^\alpha/(1 + i)$,
where $\mu_\beta{}^\alpha$ are odd integers such that
(6.18) $\mu_1^1 \mu_2^2 - \mu_2^1 \mu_1^2 = -2$.

In cases II and IV, (6.16) and (6.18) are, by (6.15), equivalent respectively
to (6.14) and (6.17). In these cases the condition that the sum of the $\lambda_\beta{}^\alpha$ be
an even integer need not be stated separately since it follows from the other
requirements, as can be seen by examining the possible remainders of $\mu_\beta{}^\alpha$ on
division by 2.
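As an illustration of condition I, the following Python sketch checks directly that a spin-transformation of determinant 1 and its inverse (6.12) carry the spinvectors $e_{(1)} = (1 + i, 0)$ and $e_{(2)} = (1, 1)$ into integral spinvectors as in (6.13). It assumes, since the definition lies outside this section, that a complex integer $a + bi$ is called even when it is divisible by $1 + i$, that is when $a + b$ is even; the function names and the tuple layout are hypothetical.

```python
# Gaussian integers are stored as (re, im) pairs of Python ints.
# ASSUMPTION (not stated in this section): a + bi is "even" when a + b is even.

def g_add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def g_mul(a, b):
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])

def g_neg(a):
    return (-a[0], -a[1])

def is_even(a):
    return (a[0] + a[1]) % 2 == 0

E1 = ((1, 1), (0, 0))   # e_(1) = (1 + i, 0)
E2 = ((1, 0), (1, 0))   # e_(2) = (1, 1)

def apply(lam, c):
    """lam = (l11, l21, l12, l22), where lXY stands for lambda with lower index X, upper index Y."""
    l11, l21, l12, l22 = lam
    return (g_add(g_mul(l11, c[0]), g_mul(l21, c[1])),
            g_add(g_mul(l12, c[0]), g_mul(l22, c[1])))

def is_integral_spinvector(c):
    # both components odd or both even
    return is_even(c[0]) == is_even(c[1])

def satisfies_condition_I(lam):
    l11, l21, l12, l22 = lam
    det = g_add(g_mul(l11, l22), g_neg(g_mul(l21, l12)))
    total = g_add(g_add(l11, l21), g_add(l12, l22))
    return det == (1, 0) and is_even(total)

def inverse_det_one(lam):
    """Inverse of a determinant-1 transformation, equation (6.12)."""
    l11, l21, l12, l22 = lam
    return (l22, g_neg(l21), g_neg(l12), l11)

def maps_basis_to_integral(lam):
    inv = inverse_det_one(lam)
    return all(is_integral_spinvector(apply(m, e))
               for m in (lam, inv) for e in (E1, E2))
```

For a determinant-1 transformation the two tests agree: the images of $e_{(1)}$ are always even, and the images of $e_{(2)}$ have components of equal parity exactly when the sum of the four $\lambda$ is even.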

The integral spin-transformations of types III and IV can be replaced by 
spin-transformations of determinant +1 if the phase factor $e^{-i\pi/4} =
2^{1/2}/(1 + i)$ is introduced. This procedure has the disadvantage of intro-
ducing the irrationality $2^{1/2}$, but it has the advantage that the resulting
spin-transformations together with those of types I and II form a group. 

From the discussion of the diophantine equation (6.14) in Sec. 1 we see 
that there is a discrete sixfold infinity of integral spin-transformations of the 
type I. Similarly, it can be shown for each of the types II, III, and IV, that 
there is a discrete sixfold infinity of integral spin-transformations. Since 
there is a 2-1 correspondence between integral spin-transformations and 
proper integral Lorentz transformations, we have: 

The group of proper integral Lorentz transformations is a discrete, sixfold 
infinite set. 

We have not hesitated to “count’’ the order of infinity of the integral 
Lorentz group because this emphasizes the large number of integral Lorentz 
transformations. However, since we are dealing with an enumerably infinite 
discrete group of transformations without infinitesimal elements, the statement 
that the group is sixfold infinite has no invariant significance and must not 
be taken too literally. A different parametrization of the elements of the 
group may easily result in an order of infinity other than six. 


7. Equivalence of Primitive Integral Null Vectors. We shall now prove 
that, given two primitive integral null vectors, an integral Lorentz transformation 
can be found which maps the one into the other. Thus all primitive integral 
null vectors are equivalent in the sense that no single such vector possesses 
an invariant property which is not shared by all others. 

In Sec. 3 we saw that if the vector $(t, x, y, z)$ is a primitive integral null
vector, then $t$ is odd and one of $x, y, z$ is odd, the remaining two components
being even. A primitive integral null vector with $y$ or $x$ odd is mapped into
a vector with $z$ odd by the proper integral Lorentz transformation which
cyclically permutes the x-, y-, and z-axes once or twice. It follows that
it is sufficient to prove the italicized statement for the case where the two
assigned primitive integral null vectors have odd z-components. Then,
by Sec. 3, the two null vectors are represented by spinvectors of the form (3.11):
(7.01) $c^1 = (1 + i)d^1$, $c^2 = (1 + i)d^2$,
where $d^1$ and $d^2$ are relatively prime integers of which one is even and one













odd. It is obviously sufficient to show that there always exists an integral 
spin-transformation mapping an integral spinvector of the type considered 
into the spinvector $e_{(1)} = (1 + i, 0)$.

Consider the spin-transformation

(7.02) $\lambda_\beta{}^\alpha c^\beta = e_{(1)}{}^\alpha$.

By (7.01), we can write this

(7.03) $\lambda_1^1 d^1 + \lambda_2^1 d^2 = 1$,

(7.04) $\lambda_1^2 d^1 + \lambda_2^2 d^2 = 0$.

The last equation is satisfied if we put

(7.05) $\lambda_1^2 = -d^2$, $\lambda_2^2 = d^1$.

Then equation (7.03) is identical with the condition (6.14). Since, by (7.05),
$\lambda_1^2$ and $\lambda_2^2$ are relatively prime integers and $\lambda_1^2 + \lambda_2^2$ is odd, we can, by Sec. 1,
find integers $\lambda_1^1$ and $\lambda_2^1$ satisfying (7.03), or equivalently (6.14), and such that
$\lambda_1^1 + \lambda_2^1$ is odd. It follows that $\lambda_1^1 + \lambda_2^1 + \lambda_1^2 + \lambda_2^2$ is even. Thus conditions
I (Sec. 6) for integral spin-transformations are satisfied by $\lambda_\beta{}^\alpha$ and our proof
is complete.
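The construction (7.03) to (7.05) can be tried out numerically. The sketch below is a simplification, not the general argument: it assumes that $d^1$ and $d^2$ are ordinary (rational) integers, whereas the text allows complex integers, and it replaces the appeal to Sec. 1 by the extended Euclidean algorithm. The function names are hypothetical.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where |g| = gcd(a, b)."""
    if b == 0:
        return (a, 1, 0)
    g, x, y = ext_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def spin_transformation_to_e1(d1, d2):
    """Type I transformation (l11, l21, l12, l22) taking ((1+i)d1, (1+i)d2) to (1+i, 0).

    lXY stands for lambda with lower index X and upper index Y.
    ASSUMPTION: d1, d2 are ordinary integers, coprime, one even and one odd.
    """
    g, x, y = ext_gcd(d1, d2)
    if g < 0:
        g, x, y = -g, -x, -y
    if g != 1 or (d1 + d2) % 2 == 0:
        raise ValueError("d1, d2 must be coprime with one even and one odd")
    l11, l21 = x, y                 # (7.03): l11*d1 + l21*d2 = 1
    if (l11 + l21) % 2 == 0:
        # Another solution of (7.03); the sum changes by d2 - d1, which is odd.
        l11, l21 = l11 + d2, l21 - d1
    l12, l22 = -d2, d1              # (7.05)
    return l11, l21, l12, l22
```

The parity adjustment uses only the fact that $d^2 - d^1$ is odd, so one shift along the family of solutions of (7.03) always suffices.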


8. Integral Time Lines are Spatially Dense. Integral time lines are the
transforms of the t-axis (x = y = z = 0) under integral Lorentz transforma-
tions. A primitive integral vector having the direction of an integral time
line will be called a primitive integral time vector; it is the transform under an
integral Lorentz transformation of the vector

(8.01) $\xi_{(0)} = (1, 0, 0, 0)$.


Thus far integral Lorentz transformations have been regarded as mappings 
of space-time into itself, which map the points of the cubic lattice into other 
lattice points. However, an integral Lorentz transformation can also be 
regarded as a change to a new coordinate system, such that the points of the 
cubic lattice have again integral coordinates with respect to the new coordinate 
axes. Such a coordinate system will be called an integral Lorentz frame. 
Integral time lines are merely the t-axes of integral Lorentz frames.

Before we investigate the main theorem of this section we shall consider
briefly the velocities associated with integral time lines. By "velocity"
is meant the velocity of a particle whose world line coincides with the integral 
time line, or, equivalently, the velocity of a particle at rest in the corresponding 
integral Lorentz frame. 

The components $t, x, y, z$ of a primitive integral time vector satisfy the
diophantine equation
(8.02) $t^2 - x^2 - y^2 - z^2 = 1$.

The velocity $v$ associated with this integral time vector is given by

(8.03) $v = \left(\frac{x^2}{t^2} + \frac{y^2}{t^2} + \frac{z^2}{t^2}\right)^{1/2}$.

By (8.02), this reduces to

(8.04) $v = \left(1 - \frac{1}{t^2}\right)^{1/2} = \frac{1}{t}(t^2 - 1)^{1/2}$.

Since $t$ must be an integer we see that the only possible velocities are, for
$t = 1, 2, 3, 4, \dots$,
(8.05) $v = 0,\ \tfrac{1}{2} 3^{1/2},\ \tfrac{2}{3} 2^{1/2},\ \tfrac{1}{4}(15)^{1/2}, \dots$.
Remembering that we have chosen the velocity of light $c$ equal to unity,
we see that the velocities (other than zero) associated with integral time lines
are very high, the smallest velocity being $\tfrac{1}{2} 3^{1/2} = 0.866$ times the velocity of light.
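The list (8.05) is easy to reproduce. The following sketch evaluates (8.04) and, as a side check, enumerates the lattice vectors that satisfy (8.02) for a given $t$, up to sign changes and permutations of the spatial components. The helper names are hypothetical, and (8.02) is used here only as a necessary condition.

```python
from math import isqrt, sqrt

def velocity(t):
    """Equation (8.04), with the velocity of light equal to unity."""
    return sqrt(t * t - 1) / t

def solutions_of_8_02(t):
    """Vectors (t, x, y, z) with x >= y >= z >= 0 and t^2 - x^2 - y^2 - z^2 = 1."""
    n = t * t - 1
    found = []
    for x in range(isqrt(n), -1, -1):
        for y in range(min(x, isqrt(n - x * x)), -1, -1):
            z_squared = n - x * x - y * y
            z = isqrt(z_squared)
            if z * z == z_squared and z <= y:
                found.append((t, x, y, z))
    return found
```

For $t = 2$ the enumeration yields the single vector $(2, 1, 1, 1)$, which has the minimum non-zero velocity. For $t = 4$ it yields nothing, since 15 is not a sum of three squares, so the entries of (8.05) should be read as an upper list of candidates rather than as values that are all attained.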

An example of an integral time line, associated with the minimum non-zero
velocity $\tfrac{1}{2} 3^{1/2}$, is given by the transform of the t-axis under the integral Lorentz
transformation

(8.06) [4 × 4 integer matrix; its entries are not legible in the source]

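We now proceed to show that integral time lines are spatially dense.
The vector $\xi_{(0)}$ (8.01) is associated with the Hermitian spintensor $a_{(0)}{}^{\dot\alpha\beta}$,
given by

(8.07) $a_{(0)}{}^{\dot 1 1} = a_{(0)}{}^{\dot 2 2} = 1$, $a_{(0)}{}^{\dot 1 2} = a_{(0)}{}^{\dot 2 1} = 0$.

The integral spin-transformation $\lambda_\beta{}^\alpha$ of type I (Sec. 6) maps $a_{(0)}{}^{\dot\alpha\beta}$ into the
spintensor $a^{\dot\alpha\beta}$, given by

(8.08) $a^{\dot 1 1} = \bar\lambda_1^1 \lambda_1^1 + \bar\lambda_2^1 \lambda_2^1$, $a^{\dot 1 2} = \bar\lambda_1^1 \lambda_1^2 + \bar\lambda_2^1 \lambda_2^2$,
       $a^{\dot 2 1} = \bar\lambda_1^2 \lambda_1^1 + \bar\lambda_2^2 \lambda_2^1$, $a^{\dot 2 2} = \bar\lambda_1^2 \lambda_1^2 + \bar\lambda_2^2 \lambda_2^2$,

and $a^{\dot\alpha\beta}$ is in turn associated with the primitive integral time vector whose
components $t, x, y, z$ are given by

(8.09) $t = \tfrac{1}{2}(\bar\lambda_1^1 \lambda_1^1 + \bar\lambda_2^1 \lambda_2^1 + \bar\lambda_1^2 \lambda_1^2 + \bar\lambda_2^2 \lambda_2^2)$,
       $x + iy = \bar\lambda_1^2 \lambda_1^1 + \bar\lambda_2^2 \lambda_2^1$,
       $z = \tfrac{1}{2}(\bar\lambda_1^1 \lambda_1^1 + \bar\lambda_2^1 \lambda_2^1 - \bar\lambda_1^2 \lambda_1^2 - \bar\lambda_2^2 \lambda_2^2)$.

We shall now show that the spatial projections of the integral time vectors
(8.09) are dense, or, equivalently, that the expression

(8.10) $\frac{x + iy}{z} = \frac{2(\bar\lambda_1^2 \lambda_1^1 + \bar\lambda_2^2 \lambda_2^1)}{\bar\lambda_1^1 \lambda_1^1 + \bar\lambda_2^1 \lambda_2^1 - \bar\lambda_1^2 \lambda_1^2 - \bar\lambda_2^2 \lambda_2^2}$

can be made to approximate to an arbitrary degree any preassigned complex
number $\beta$, which we may assume to be non-zero.

Given an arbitrary non-zero complex number $\beta$, we define $\alpha$ by the equation

(8.11) $\beta = \frac{2\alpha}{\bar\alpha\alpha - 1}$.

We may take $\alpha$ to be
(8.12) $\alpha = [1 + (1 + \bar\beta\beta)^{1/2}]/\bar\beta$,
so that $|\alpha| > 1$, and therefore $\bar\alpha\alpha - 1 > 0$.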
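The formulas (8.09), (8.11) and (8.12) can be checked numerically. The sketch below assumes the placement of the complex conjugates that makes $t^2 - x^2 - y^2 - z^2$ equal to the squared modulus of the determinant of the spin-transformation (the bars are not legible in the source), and uses hypothetical function names.

```python
from math import sqrt

def time_vector(l11, l21, l12, l22):
    """(t, x, y, z) of equation (8.09); lXY stands for lambda with lower index X, upper index Y.

    ASSUMPTION: x + iy = conj(l12)*l11 + conj(l22)*l21, the reading that
    is consistent with (8.10), (8.11) and (8.17).
    """
    n11, n21, n12, n22 = (abs(v) ** 2 for v in (l11, l21, l12, l22))
    t = (n11 + n21 + n12 + n22) / 2
    w = l12.conjugate() * l11 + l22.conjugate() * l21
    z = (n11 + n21 - n12 - n22) / 2
    return t, w.real, w.imag, z

def alpha_from_beta(beta):
    """Equation (8.12): a solution of beta = 2*alpha / (|alpha|^2 - 1) with |alpha| > 1."""
    return (1 + sqrt(1 + abs(beta) ** 2)) / beta.conjugate()

def beta_from_alpha(alpha):
    """Equation (8.11)."""
    return 2 * alpha / (abs(alpha) ** 2 - 1)
```

For the type I transformation with $\lambda_1^1 = 1$, $\lambda_2^1 = 2$, $\lambda_1^2 = 0$, $\lambda_2^2 = 1$ the first function gives a vector on the hyperboloid (8.02), and the other two functions are mutually inverse on the region $|\alpha| > 1$.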













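It is obvious that, given a small positive $\varepsilon$, we can find complex integers
$\lambda_1^1$ and $\lambda_1^2$, which are relatively prime, such that

(8.13) $\left|\alpha - \frac{\lambda_1^1}{\lambda_1^2}\right| < \frac{\varepsilon}{2}$, $|\lambda_1^2| > \frac{2}{\varepsilon}$,

and such that $\lambda_1^1$ is even. Then, by Sec. 1, non-zero integers $\lambda_2^1$, $\lambda_2^2$ can be
found, satisfying
(8.14) $\lambda_1^1 \lambda_2^2 - \lambda_2^1 \lambda_1^2 = 1$,
and such that $\lambda_1^1 + \lambda_2^1 + \lambda_1^2 + \lambda_2^2$ is even. Then $\lambda_\beta{}^\alpha$ are the components
of an integral spin-transformation of type I (Sec. 6). From (8.14) we obtain

(8.15) $\left|\frac{\lambda_1^1}{\lambda_1^2} - \frac{\lambda_2^1}{\lambda_2^2}\right| = \frac{1}{|\lambda_1^2 \lambda_2^2|} < \frac{\varepsilon}{2}$

by the second inequality of (8.13). Combining (8.13) with (8.15), we have

(8.16) $\left|\alpha - \frac{\lambda_2^1}{\lambda_2^2}\right| < \varepsilon$.

Thus both $\lambda_1^1/\lambda_1^2$ and $\lambda_2^1/\lambda_2^2$ approximate $\alpha$. Substituting in (8.11), we see
that the number $\beta$ is approximated by the two fractions

(8.17) $\frac{2\bar\lambda_1^2 \lambda_1^1}{\bar\lambda_1^1 \lambda_1^1 - \bar\lambda_1^2 \lambda_1^2}$, $\frac{2\bar\lambda_2^2 \lambda_2^1}{\bar\lambda_2^1 \lambda_2^1 - \bar\lambda_2^2 \lambda_2^2}$,

the two denominators being positive, since $\bar\alpha\alpha - 1 > 0$. It follows that $\beta$ is
approximated by the fraction which is obtained by adding the numerators
and denominators of the fractions (8.17), i.e. $\beta$ is approximated by (8.10).
This completes the proof that we can find integral time lines whose spatial
projections approximate any preassigned direction in space.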
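The approximation argument can be followed on a small scale. The sketch below treats only the special case of a real positive $\beta$ and ordinary integers (the text works with complex integers), so that $y = 0$ throughout. The choice of an odd denominator `q`, the search for an even coprime numerator, and all names are assumptions of this illustration.

```python
from math import gcd, sqrt

def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g for non-negative a, b."""
    if b == 0:
        return (a, 1, 0)
    g, x, y = ext_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def approximating_time_vector(beta, q):
    """Integral time vector (t, x, 0, z) with x/z close to a real beta > 0.

    ASSUMPTION: q is an odd positive integer playing the part of lambda_1^2.
    """
    alpha = (1 + sqrt(1 + beta * beta)) / beta   # (8.12) for real beta
    p = 2 * round(alpha * q / 2)                 # even lambda_1^1 with p/q near alpha
    while gcd(p, q) != 1:
        p += 2
    _, d, b = ext_gcd(p, q)                      # p*d + q*b = 1
    l21, l22 = -b, d                             # (8.14): p*l22 - l21*q = 1
    if (l21 + l22) % 2 == 0:                     # make the sum of all four entries even
        l21, l22 = l21 + p, l22 + q
    t = (p * p + l21 * l21 + q * q + l22 * l22) // 2
    x = q * p + l22 * l21
    z = (p * p + l21 * l21 - q * q - l22 * l22) // 2
    return t, x, 0, z
```

Increasing `q` tightens the approximation, in line with the second inequality of (8.13).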

Since every integral time line is codirectional with a primitive integral 
time vector, we deduce the following theorem: 

The set of all primitive integral time vectors is spatially dense. 

We add, without proof, the statement of a more general theorem which 
can be verified by arguments more complicated than, but quite similar to 
those just given above. 

Consider an integral Lorentz frame and any integral vector. The transforms 
of the integral vector under all integral Lorentz transformations form a set which 
is spatially dense. 

The preceding theorem is a special case of this. So also is the theorem of 
Sec. 4, once the equivalence of primitive null vectors (Sec. 7) is established. 

The author wishes to thank Professors H. S. M. Coxeter, R. J. Duffin, 
R. P. Feynman, L. Infeld, and J. L. Synge for interesting discussions and 
helpful suggestions. 

This work was completed last winter when the author was a Frank B. Jewett 
Fellow, resident at the Institute for Advanced Study, Princeton. It is a 
pleasure to thank the Frank B. Jewett Fellowship Committee for their award, 
and the Institute for its hospitality. 










APPENDIX 


Professor H. S. M. Coxeter was kind enough to show me some independent 
work of his which is essentially equivalent to our problem of finding all integral 
Lorentz transformations. He considers a lattice in hyperbolic 3-space consisting 
of the points of our cubic lattice which lie on the unit “sphere” 

(A) $t^2 - x^2 - y^2 - z^2 = 1$.
The congruent transformations of hyperbolic space which leave this lattice 
invariant as a whole are exactly our integral Lorentz transformations. 

Coxeter chooses as his basic operation the reflection in 4-space which
consists of adding the quantity $t - x - y - z$ to each of the four coordinates
$t, x, y, z$ of a point. In our notation this transformation is given by

(B) $L = \begin{pmatrix} 2 & -1 & -1 & -1 \\ 1 & 0 & -1 & -1 \\ 1 & -1 & 0 & -1 \\ 1 & -1 & -1 & 0 \end{pmatrix}$.

This is easily seen to be an integral Lorentz transformation. Combining
iteration of this transformation with the trivial operations of permuting the
spatial coordinates $x, y, z$ and of changing the signs of any of the coordinates
$t, x, y, z$, all integral Lorentz transformations (including reflections) are
obtained.

This procedure may simplify slightly some of the proofs in this paper.
For example, to show that primitive integral null vectors are equivalent,
take such a vector $(t, x, y, z)$ and by changing signs make certain that $t, x, y, z$
are all positive or zero. Then so long as $t > 1$ at least two of $x, y, z$ must
be non-zero since $(t, x, y, z)$ is assumed primitive. Hence we have

$t = (x^2 + y^2 + z^2)^{1/2} < x + y + z \le [3(x^2 + y^2 + z^2)]^{1/2} < 2t$.

It follows that $-t < t - x - y - z < 0$. Thus performing (B) and changing
signs again, we obtain an integral null vector whose t-component has been
decreased. Repeating this process it is clear that we must finally arrive at
one of the forms (1, 1, 0, 0), (1, 0, 1, 0), or (1, 0, 0, 1). Permuting the spatial
coordinates we can reduce the given primitive integral null vector to the
standard form 
(C) (1, 1, 0, 0). 
This establishes the theorem of Sec. 7. 
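The descent just described is short enough to run. The following sketch applies the reflection (B) and sign changes until the t-component equals 1; the function names are hypothetical, and the primitivity test is added only to guarantee termination.

```python
from math import gcd

def minkowski(v):
    t, x, y, z = v
    return t * t - x * x - y * y - z * z

def coxeter_reflection(v):
    """Transformation (B): add t - x - y - z to each coordinate."""
    t, x, y, z = v
    s = t - x - y - z
    return (t + s, x + s, y + s, z + s)

def reduce_null_vector(v):
    """Reduce a primitive integral null vector to the standard form (C)."""
    t, x, y, z = (abs(c) for c in v)
    if t == 0 or minkowski((t, x, y, z)) != 0:
        raise ValueError("not a non-zero null vector")
    if gcd(gcd(t, x), gcd(y, z)) != 1:
        raise ValueError("vector is not primitive")
    while t > 1:
        # (B) followed by sign changes strictly decreases t
        t, x, y, z = (abs(c) for c in coxeter_reflection((t, x, y, z)))
    x, y, z = sorted((x, y, z), reverse=True)   # permute the spatial coordinates
    return (t, x, y, z)
```

Because (B) preserves the quadratic form, every intermediate vector is again a primitive integral null vector, so the loop invariant of the argument above holds at each step.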


University of Toronto 











NOTE UPON THE GENERALIZED CAYLEYAN 
OPERATOR 


H. W. TURNBULL 


1. The following note which deals with the effect of a certain determinantal 
operator when it acts upon a product of determinants was suggested by the 
original proof which Dr. Alfred Young gave of the property 

$(NP)^2 = \theta NP$
subsisting between the positive P and the negative N substitutional operators,
$\theta$ being a positive integer¹. This result, which establishes the idempotency of
the expression $\theta^{-1}NP$ within an appropriate algebra, is fundamental in the
Quantitative Substitutional Analysis that Young developed. The present
note, which is couched in the language of determinants, proves a result which
is equivalent to Young's alternative statement $(PN)^2 = \theta PN$.

These operators P and N take their rise in the theory of groups. In fact let 

$p = p_1 + p_2 + \dots + p_h$
be a partition of a positive integer $p$ into $h$ non-zero parts which are arranged
in descending order: that is
$p_1 \ge p_2 \ge \dots \ge p_h$.
Let $p$ distinct elements be arranged in the following fashion

$u_1, u_2, \dots, u_{p_1}$,
$v_1, v_2, \dots, v_{p_2}$,
. . .
$w_1, w_2, \dots, w_{p_h}$,

so as to form an array of $h$ rows and $p_1$ columns, each row being filled conse-
cutively from the left and starting at the first column, while each column is
filled consecutively downwards and starts at the first row. No row can exceed
in length any row which lies above it, and no column can exceed any column
which is upon its left. If $p_1 = p_h$, the array is rectangular: but usually $p_1 > p_h$
and the array has a zigzag boundary upon its right. This array is called a
tableau.

Let $f(u_1, \dots, w_{p_h})$ be a function of these $p$ elements, treated as $p$ arguments
of the function, and let $p!$ expressions be formed by interchanging the argu-
ments in every possible way. Usually these expressions will be distinct, as
for instance the 2! expressions $f(x, y)$ and $f(y, x)$ differ, unless $f$ happens to be
symmetric in these two arguments. Let $\delta_i$ denote the operation of producing
the $i$-th of these expressions, namely

$\delta_i f(u_1, \dots, w_{p_h}) = f(u'_1, \dots, w'_{p_h})$
where $u'_1, \dots$ denote the corresponding arrangement of the $p$ arguments
Received February 23, 1948. 


1[5] p. 366. 


$u_1, \dots, w_{p_h}$. There are therefore $p!$ such operations $\delta_i$ and they characterize
the symmetric group of order $p!$. Let those $p_1!$ distinct operations be per-
formed which permute the elements belonging to the first row only of the
tableau. Let the sum of the resulting functions be regarded as the effect of
a resultant operation, $P_1$ say, acting upon the original function: namely
$P_1 f = \sum \delta f(u_1, \dots, u_{p_1}, v_1, \dots) = \sum f(u'_1, \dots, u'_{p_1}, v_1, \dots)$

where the first $p_1$ arguments only are to be permuted, while the remainder are
unchanged. This summation has $p_1!$ terms.

Let a corresponding operation be performed for the $i$-th row of the tableau.
By taking the rows successively in turn we thus obtain $h$ such operations $P_i$.
Since each of these operations affects a distinct set of arguments, the $h$ oper-
ations are independent of one another. We can therefore combine them in
any order and form a further resultant operation

$P = P_1 P_2 \dots P_h = P_2 P_1 \dots P_h = \dots$,
which consists of $p_1!\, p_2! \dots p_h!$ terms, obtained by all the possible different
permutations of the elements, each within its own row of the tableau. Since
these terms are added together, this $P$ is called the positive symmetric group
associated with the tableau.

In contrast to this a new operator $N_1$ is defined, with reference to the first
column of the tableau, and consisting of $h!$ terms caused by the complete set
of permutations among the elements of this column: only in this case each
term that belongs to an odd permutation is accompanied by a negative sign,
and otherwise by a positive sign: namely

$N_1 f = \sum (\pm)_j \delta_j f(u_1, \dots)$
where the summation has $h!$ terms, $\delta_j$ denotes the typical permutation of
$u_1, v_1, \dots, w_1$, and $(\pm)_j$ denotes a positive or negative sign according as the
corresponding permutation is even or odd. Let $p_1$ such operations be defined,
one for each column, and combined as before into a resultant operation
$N = N_1 N_2 \dots N_{p_1} = N_2 N_1 \dots N_{p_1} = \dots$

which consists of $q_1!\, q_2! \dots$ terms, where $q_j$ denotes the number of elements in
the $j$-th column ($j = 1, 2, \dots, p_1$). This $N$ is called the negative symmetric
group associated with the tableau. If a row or a column possesses a single
element only, the corresponding factor of $P$ or $N$ may be omitted as it has the
effect of the factor unity in the whole product.

When a further operator is made by using P and N in succession the pro- 
ducts PN and NP usually differ. They do however satisfy the same quadratic 
relation $X^2 = \theta X$, where $X = PN$ or $NP$, as already mentioned. One more
preliminary remark should be made, before turning to the application of this
theory of Young's Substitutional Operators: namely, that the expression
$f(u_1, \dots)$ upon which the operator takes effect may be construed in a most
general sense, provided only that each particular arrangement of the $p$ elements
$u_1, \dots$ defines the expression and that they make sense when they are per-
muted. For instance $f$ might be a determinant, and the $u_i$ might denote suf-
fixes which indicate the columns of the determinant.
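For a small tableau the quadratic relation can be verified by brute force. The sketch below builds $P$ and $N$ as elements of the group algebra of the symmetric group for the tableau with rows $\{0, 1\}$, $\{2\}$ and columns $\{0, 2\}$, $\{1\}$. The representation of permutations as tuples, the composition convention, and the function names are choices of this illustration.

```python
from itertools import permutations

def compose(s, t):
    """Product of permutations given as tuples of images: apply t first, then s."""
    return tuple(s[j] for j in t)

def sign(perm):
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def block_preserving(blocks, n):
    """All permutations of range(n) that map every block into itself."""
    return [s for s in permutations(range(n))
            if all(s[i] in block for block in blocks for i in block)]

def multiply(a, b):
    """Product in the group algebra; elements are dicts {permutation: coefficient}."""
    out = {}
    for s, cs in a.items():
        for t, ct in b.items():
            k = compose(s, t)
            out[k] = out.get(k, 0) + cs * ct
    return {k: c for k, c in out.items() if c != 0}

def scaled(a, factor):
    return {s: factor * c for s, c in a.items()}

def young_operators(rows, columns, n):
    """P sums the row permutations; N sums the column permutations with signs."""
    P = {s: 1 for s in block_preserving(rows, n)}
    N = {s: sign(s) for s in block_preserving(columns, n)}
    return P, N
```

With three elements arranged in rows of lengths 2 and 1, both products satisfy $X^2 = 3X$ although $PN$ and $NP$ differ, so that $\theta = 3$ for this tableau.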










The connexion between the abstract analysis and the determinantal theory
is as follows. The $n \times n$ determinant $\sum \pm x_{11} x_{22} \dots x_{nn}$ may be written
$N_1\phi$, where $\phi$ is $x_{11} x_{22} \dots x_{nn}$ and $N_1$ is the operator which permutes the $n$
second suffixes of the $x_{ij}$ in all possible ways and sums the results accompanied
by a negative sign for each interchange of a pair of suffixes. A product $\phi$ of $\nu$
determinants may consequently be written $N\phi = N_1 N_2 \dots N_\nu \phi$, where $N_j$ is
the operator which generates the $j$-th determinant from its leading term. If
the determinants which compose the product $N\phi$ are not all of the same order,
the factors are to be arranged in a descending order. The operation $P_1$ is then
that which generates a sum of $\nu!$ such terms $N\phi$ by permuting the first columns,
one from each of the $\nu$ determinants, in all their different ways and adding
together the results: $P_2$ likewise permutes all the second columns; and so on
until all the columns are so treated. Then $P = P_1 P_2 \dots$, and $PN\phi$ is the
final expression. This positive substitutional operation is reflected, in what
follows, by taking a single product of determinants and making all first columns
that occur the same; and so on. Except for a factor $\nu!$ the two expressions are
substitutionally equivalent. Again, instead of taking a product $\phi$ of deter-
minants whose orders may differ, all the factors have been brought up to the
same order $n \times n$, by the introduction of arbitrary constant borders, in dis-
tinction from which those elements $x_{ij}$ that undergo permutation (or, equi-
valently, differentiation) are called the variables. Young's formula is implicit
in (13) below.


2. Let $x_1, x_2, \dots, x_n$ denote $n$ sets of $n$ independent variables such that $x_i$
denotes the $i$-th set $\{x_{i1}, x_{i2}, \dots, x_{in}\}$ when it is arranged in a column. Let
$\Delta = (x_1 x_2 \dots x_n)$ denote the $n \times n$ determinant of these $n$ columns in this order,
so that $\Delta$ is a function of $n^2$ independent variables $x_{ij}$. Let $\Omega = (\partial/\partial x_1 \dots
\partial/\partial x_n)$ denote the corresponding determinant when each element $x_{ij}$ is re-
placed, in its own position, by the corresponding differential operator $\partial/\partial x_{ij}$:
thus $\partial/\partial x_i$ denotes the column of the operators which correspond to the
column $x_i$.

Let $\sum a_i\, \partial/\partial x_{ni} = (a \mid \partial/\partial x_n)$
denote the polar operator which substitutes a set of $n$ arbitrary constants $a_i$
for the set of variables $x_n$. Since $\Delta$ is a linear form in the $n$ components of $x_n$,
it follows at once by differentiation that

$(a \mid \partial/\partial x_n)\, \Delta = (x_1 x_2 \dots x_{n-1} a)$

which we abbreviate to $(X_{n-1}a)$. More generally, and by further such polari-
zations of the $x_i$, let $r$ of these sets, say the last $r$, be replaced by $r$ columns of
arbitrary constants, namely
(1) $\Delta_r = (x_1 x_2 \dots x_{n-r} a_1 a_2 \dots a_r)$
which we write as $A_Z \Delta = \Delta_r = (X_{n-r}A_r)$, where $A_Z$ denotes the operator which
substitutes the block $A_r$ of $r$ columns for $Z$, the block of the last $r$ columns of $\Delta$.
(The above single column operator is therefore written as $a_z$, with $z = x_n$.)

If this is done for the first $n - 1$ values of $r$ we obtain altogether $n$ different













determinants, each of which involves the first column $x_1$ of the variables, all
but one involve the second column $x_2$, and so on, until $\Delta$ alone involves the last
column $x_n$. Let a power product

(2) $\phi = \Delta^{p_0} \Delta_1^{p_1} \dots \Delta_{n-1}^{p_{n-1}} = \prod_{r=0}^{n-1} (X_{n-r}A_r)^{p_r}$

of these determinants be constructed, where the exponents $p_r$ are zero or posi-
tive integers, and where all the blocks of constants $A_r$ are arbitrary.

For example $\phi = (xyz)^{p_0}(xy\alpha)^{p_1}(x\beta\gamma)^{p_2}$ is such a product of three rowed deter-
minants.

It is well known, and indeed it is a fundamental result in the theory of
projective invariants, that the effect of the Cayleyan² operator $\Omega = |\partial/\partial x_{ij}|$,
already mentioned, acting upon a perfect $p$-th power of $\Delta$, is analogous to
ordinary differentiation with regard to $\Delta$ and yields the identity
(3) $\Omega \Delta^p = p(p + 1) \dots (p + n - 1)\, \Delta^{p-1}$.

The object of the present note is to extend this property to the more general
power product $\phi$, and to shew that

(4) $\Omega\phi = p_0(p_0 + p_1 + 1) \dots (p_0 + p_1 + \dots + p_{n-1} + n - 1)\, \phi_1$,

where $\phi_1 = \Delta^{-1}\phi$, that is to reduce the index $p_0$ by unity while leaving the remain-
ing indices unchanged. Naturally if $p_0 = 0$, $\Omega\phi$ vanishes.
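For $n = 2$ the identities (3) and (4) can be confirmed by exact polynomial arithmetic. The sketch below stores a polynomial in $x_{11}, x_{12}, x_{21}, x_{22}$ as a dictionary from exponent tuples to integer coefficients, and takes $\phi = \Delta^{p_0}(x_1 a)^{p_1}$ with a numerical column $a$. The representation, the sample constants and the names are assumptions of this illustration.

```python
# Variable order in every exponent tuple: (x11, x12, x21, x22).

def clean(f):
    return {m: c for m, c in f.items() if c != 0}

def add(f, g, sign=1):
    """Return f + sign * g."""
    h = dict(f)
    for m, c in g.items():
        h[m] = h.get(m, 0) + sign * c
    return clean(h)

def scale(f, k):
    return clean({m: k * c for m, c in f.items()})

def mul(f, g):
    h = {}
    for m1, c1 in f.items():
        for m2, c2 in g.items():
            m = tuple(a + b for a, b in zip(m1, m2))
            h[m] = h.get(m, 0) + c1 * c2
    return clean(h)

def power(f, k):
    h = {(0, 0, 0, 0): 1}
    for _ in range(k):
        h = mul(h, f)
    return h

def diff(f, i):
    """Partial derivative with respect to the i-th variable."""
    h = {}
    for m, c in f.items():
        if m[i] > 0:
            m2 = m[:i] + (m[i] - 1,) + m[i + 1:]
            h[m2] = h.get(m2, 0) + c * m[i]
    return clean(h)

def var(i):
    m = [0, 0, 0, 0]
    m[i] = 1
    return {tuple(m): 1}

X11, X12, X21, X22 = (var(i) for i in range(4))
DELTA = add(mul(X11, X22), mul(X12, X21), sign=-1)   # determinant (x_1 x_2)

def omega(f):
    """Cayleyan operator for n = 2: d/dx11 d/dx22 - d/dx12 d/dx21."""
    return add(diff(diff(f, 3), 0), diff(diff(f, 2), 1), sign=-1)

def identity_4_holds(p0, p1, a=(3, 5)):
    """Check (4) for phi = DELTA^p0 * (x_1 a)^p1 with p0 >= 1."""
    a1, a2 = a
    polar = add(scale(X11, a2), scale(X12, a1), sign=-1)  # determinant (x_1 a)
    phi = mul(power(DELTA, p0), power(polar, p1))
    phi1 = mul(power(DELTA, p0 - 1), power(polar, p1))
    return omega(phi) == scale(phi1, p0 * (p0 + p1 + 1))
```

Taking $p_1 = 0$ reduces the check to the classical identity (3) with $n = 2$.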

To prove this we shall first establish a more general theorem. In fact let a
set of positive integers $\lambda_r$ be introduced where

$\lambda_r = p_0(p_0 + p_1 + 1) \dots (p_0 + p_1 + \dots + p_{r-1} + r - 1)$,

with $r = 1, 2, \dots, n$. From an $n \times n$ determinant of arbitrary constants let
the last $r$ columns be chosen and called $B$. Furthermore let
(5) $B_Z = (b_1 b_2 \dots b_r \mid \partial/\partial z_1\ \partial/\partial z_2 \dots \partial/\partial z_r)$

denote the bideterminantal (or compound inner product) operator obtained
by combining the $r$ columns of $B$ with the last $r$ columns of $\Omega$. Here for con-
venience the $(n - r + 1)$-th set $x$ has been renamed $z_1$, and so on until the last
$x_n$ is the same as $z_r$. With this understanding the following result holds:

(6) THEOREM. $B_Z\phi = \lambda_r (X_{n-r}B)\, \phi_1$.

Proof. We proceed by induction upon $r$. For if $r = 1$, and $b$ denotes a
single column and $z$ denotes $x_n$, then, by differentiation,

$b_z \Delta^{p_0} = p_0 \Delta^{p_0 - 1}\, b_z \Delta$.
But since $\Delta = (X_{n-1}z)$, $b_z \Delta = (X_{n-1}b)$.
Hence $b_z \Delta^{p_0} = p_0 (X_{n-1}b)\, \Delta^{p_0 - 1}$,
that is $b_z\phi = \lambda_1 (X_{n-1}b)\phi_1$ since $z$ is absent from all the remaining factors
belonging to $\phi$: which proves the result when $r = 1$. By assuming it true for
$r$ we shall prove it true for $r + 1$. To do this, write
$X_{n-r} = Xy$, $y = x_{n-r}$,
so that $y$ denotes the last of the $n - r$ columns $x$, and $X$ denotes all the earlier
columns. The original set of $n$ columns is now exhibited by
$\Delta = (XyZ)$.


²[2].













Let $\Delta_B = (XyB)$. The assumed identity is therefore
(7) $B_Z\phi = \lambda_r\, \Delta_B\, \Delta^{p_0 - 1} \Delta_1^{p_1} \dots \Delta_r^{p_r}\, N$
where $N$ denotes all those factors into which the column $y$ does not enter, since
in (2) $y$ does not enter $\Delta_s$ whenever $s > r$. Now each of the $r + 2$ (unrepeated
and repeated) factors $\Delta_B, \Delta, \Delta_1, \dots, \Delta_r$ is of the form
$(XyT)$
where the block of $r$ columns $T$ differs but $X$ and $y$ are always present in each
factor. Operate with $c_y$, that is $\sum c_i\, \partial/\partial y_i$, upon both sides of the equation (7).
On the right-hand side we obtain a sum of $r + 2$ terms, one for each different $\Delta$.
Thus, the affected parts in the various terms are
$c_y \Delta_B = (XcB)$, $c_y (XyT)^{p_s} = p_s (XcT)(XyT)^{p_s - 1}$.
Now perform³ the determinantal permutation $\{c, B\}'$ which consists of $r + 1$
terms interchanging $c$ with each of the $r$ columns of $B$ in turn, accompanied
by a change of sign, and adding the term (the static term, let us say) in which
$c$ remains unmoved. The result of this upon the left member $c_y B_Z \phi$ of our
equation produces the corresponding operator of order $r + 1$, namely

$\{c, B\}'\, c_y B_Z = (cB \mid \partial/\partial y\ \partial/\partial Z) = (c\, b_1 \dots b_r \mid \partial/\partial y\ \partial/\partial z_1 \dots \partial/\partial z_r)$,

as is seen at once on expanding this last determinantal expression by its first
column.

On the right there are $r + 2$ terms, as already seen. In the first term $\Delta_B$
has been altered to $(XcB)$, and in any other term a single $\Delta_s$, say, has been
altered to $(XcT)$ multiplied by $p_s$. The effect of the new operation $\{c, B\}'$
on the first term produces

$(r + 1)(XcB)$

from $c_y \Delta_B$, merely by deranging the $r + 1$ columns of $cB$ within this deter-
minant. In each of the other terms the new operation convolves the columns
$c, B$ which occur entirely within $\Delta_s$ and $\Delta_B$ respectively. But by the funda-
mental identity⁴
$\{c, B\}'(XyB)(XcT) = (XcB)(XyT)$

that is, the operation interchanges the $c$, wherever it occurs, with the $y$ which
occurs in $\Delta_B$, the first determinantal factor. This restores the full exponent
$p_s$ to $\Delta_s$ for $s = 1, 2, \dots, r$, and in the case of $\Delta$ itself restores $p_0 - 1$ which
had dropped to $p_0 - 2$ through the operation $c_y$. Gathering these results
together we infer that

(8) $(cB \mid \partial/\partial y\ \partial/\partial Z)\, \phi = \lambda_{r+1} (XcB)\, \phi_1$,

which is of the same form as the assumed identity but with $r + 1$ replacing $r$.
Since the identity is true when $r = 1$ this proves it by induction for $r = 1, 2,
\dots, n$. In the last stage when $r = n - 1$ in (8) the operator factorizes into





³[4] p. 27. ⁴[4] p. 44.







$(cB)\Omega$, and all the columns of $X$ have disappeared. On taking the arbitrary
$n \times n$ determinant $(cB)$ to be the unit determinant $|\delta_{ij}|$ the original identity

$\Omega\phi = \lambda_n \phi_1$
emerges.

COROLLARY 1. The same identity is true if each $A_r$ that occurs is replaced
by $p_r$ arbitrary blocks $A'_r, A''_r$, etc. This follows since, in the above proof,
no use is made of the value of $A_r$, but only of its extent.


3. On writing
(9) $p_0 + p_1 + \dots + p_{n-1} = q_1$,
    $p_0 + p_1 + \dots + p_{n-2} = q_2$,
    . . .
    $p_0 = q_n$,
where $q_i = p_0 + p_1 + \dots + p_{n-i}$, we obtain the numbers of times which each $x_i$
occurs in the product $\phi$, $x_i$ occurring exactly $q_i$ times, for $i = 1, 2, \dots, n$.
In particular $z = x_n$, the last column of $\Delta$, appears $q_n$ times. Accordingly,
if we operate $q_n$ times in succession with $\Omega$ and apply the theorem, we obtain
(10) $\Omega^{q_n}\phi = \mu_0\, \Delta_1^{p_1} \Delta_2^{p_2} \dots \Delta_{n-1}^{p_{n-1}} = \mu_0\psi$, say,

where $\mu_0$ is a product of positive integers $\lambda_n$, and from which the last column
$x_n$ has disappeared. From (4) we obtain

(11) $\mu_0 = \frac{p_0!\,(p_0 + p_1 + 1)! \dots (p_0 + p_1 + \dots + p_{n-1} + n - 1)!}{(p_1 + 1)! \dots (p_1 + \dots + p_{n-1} + n - 1)!}
           = \frac{q_n!\,(q_{n-1} + 1)! \dots (q_1 + n - 1)!}{(q_{n-1} + 1 - p_0)! \dots (q_1 + n - 1 - p_0)!}$.

Now let $\Omega_1$ denote the $(n - 1)$-fold column of operators (each component being
a determinant of order $n - 1$):

$\{\, |\partial/\partial x_1\ \partial/\partial x_2 \dots \partial/\partial x_{n-1}|\, \}$
and let $C_Z = (C \mid \Omega_1)$ be such as the operator (5) but with $r = n - 1$. Then, by the
theorem, $C_Z\psi$ reduces the exponent $p_1$ of $\Delta_1$ by unity, and $C_Z$ applied $p_1$ times
replaces the $X$ in this factor by $C$, and introduces a positive integral factor

$\mu_1 = \frac{p_1!\,(p_1 + p_2 + 1)! \dots (p_1 + \dots + p_{n-1} + n - 2)!}{(p_2 + 1)! \dots (p_2 + \dots + p_{n-1} + n - 2)!}$.

Now write $C_{n-1}$ for this block $C$. We may proceed in this way with further
operators $(C_r \mid \Omega_{n-r})$ of this type, where $r = n - 2, n - 3$ and so on, in suc-
cession: for the theorem is directly applicable at each such stage, and replaces
all the $X_{n-r}$ in $\Delta_r^{p_r}$ by an equal number of $C_{n-r}$ while attaching a further
positive integral factor $\mu_r$. If preferred all the $C_{n-r}$ which are $p_r$ in number can
be distinct, for they are arbitrary. The whole operation can now be written

$\Omega^{p_0} \prod_{s=1}^{n-1} (C_{n-s} \mid \Omega_s)^{p_s}$

and, since it is composed entirely of differential operators $\partial/\partial x_{ij}$ and constants,
the order of its factors is immaterial. On retaining the original $n \times n$ constant












determinant (now called $|C_n|$) along with $\Omega$ we may drop the factor $\Omega^{p_0}$ and let $s$
run from 0 to $n - 1$ in the product; for $(C_n \mid \Omega_0)$ factorizes to $|C_n|\, \Omega$ since $|\Omega_0| = \Omega$.
It is then convenient to express the whole operator in terms of the integers
$q_i$ as follows:

(12) $\prod_{s=0}^{n-1} (C_{n-s} \mid \Omega_s)^{p_s} = (C_{q_1 q_2 \dots q_n} \mid \Omega_{q_1 q_2 \dots q_n}) = (C_Q \mid \Omega_Q) = C_Q$.

Here the capital suffix $Q$ denotes the multiple suffix $q_1 q_2 \dots q_n$: and in the latter,
which is a set of positive integers written in descending order since their first
differences are the $p_i$ which are $\ge 0$, it is unnecessary to include any zero
suffixes. This $Q$ therefore denotes a partition $\{q_1 q_2 \dots\}$ of $\sum q_i$, written in the
usual way. Reference to (2) shews that the operator effects the substitution
of the $C$'s for the $X$'s as follows:

(13) $\prod_{s=0}^{n-1} (C_{n-s} \mid \Omega_s)^{p_s}\, \phi = \theta_Q \prod_{r=0}^{n-1} (C_{n-r}A_r)^{p_r}$,

where $\theta_Q = \mu_0 \mu_1 \dots \mu_r \dots \mu_{n-1}$, a numerical constant which is a positive in-
teger. The more general case when all $p_r$ of the $C_{n-r}$ are distinct, for each
value of $r$, can be written down without serious difficulty (only it is rather
prolix!). It has the same numerical factor $\theta_Q$.

Two further corollaries follow at once: 

COROLLARY 2. Take all the $A_r$ which occur in the product $\phi$ to be non-zero
portions of the unit matrix $[\delta_{ij}]$, so that $(X_{n-r}A_r)$ is then an $(n - r)$ rowed
minor of the determinant $|x_{ij}|$. Thus $\phi$ is a power product of such minors of
all orders (every minor of a lower order being a minor within the columns but
not necessarily the rows occupied by a minor of a higher order, owing to the
original condition imposed upon the columns $x_i$). Take each $C_{n-r}$ to be the
complementary portion of the unit matrix so that $(C_{n-r}A_r) = 1$. Then the
corresponding operator $C_Q$ reduces $\phi$ to the positive integer $\theta_Q$.

COROLLARY 3. Replace each $C_{n-r}$ by the corresponding matrix $X_{n-r}$.
Then $(X_{n-r} \mid \Omega_r)$ is the well-known Capelli operator. The generalized operator
$X_Q$ will produce two sorts of terms when it operates on any function of the
$x_{ij}$: (i) intrinsic terms due to differentiating those parts $X_i$ of the operator
which stand in factors to the right of the partial operator $\partial/\partial x_{ij}$, and (ii) ex-
trinsic terms due to direct operation on the operand. Since the right-hand
side member of (13) reverts to $\phi$ itself on substituting the $X$ for the $C$, it follows
that

extr $X_Q\phi = \theta_Q\phi$
where the notation indicates the extrinsic terms only.

What happens to the intrinsic terms? Is there a result comparable in beauty
to the original formula of Capelli⁵? This formula expresses the operator

$\sum_J (x_1 x_2 \dots x_s)_J\, (\partial/\partial x_1\ \partial/\partial x_2 \dots \partial/\partial x_s)_J$

(for $J = i_1 i_2 \dots i_s$, any set of $s$ different integers $1, 2, \dots, n$) as a determinant
$|\, (x_i \mid \partial/\partial x_j) + (n - i)\delta_{ij}\, |$, $i, j = 1, 2, \dots, n$,

⁵[1].


' 







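where the first $n - 1$ integers appear in descending order, finishing with zero,
as additions to the elements upon the leading diagonal. These additions are
caused by the intrinsic terms, and the expansion of the whole determinant
must be taken in the strict order of its columns.⁶
As an example of the complete operator acting upon

$\phi = (xyz)(xy\alpha)^2(x\beta\gamma)$
take $C_Q = (\partial/\partial x\ \partial/\partial y\ \partial/\partial z) \cdot (\delta\varepsilon \mid \partial/\partial x\ \partial/\partial y)^2 (\zeta \mid \partial/\partial x)$ where $\delta, \varepsilon, \zeta$ are arbitrary
columns, and all the columns consist of three elements each. Then $Q$, as in
(12), denotes the suffix row 4, 3, 1 which indicate the numbers of appearances
of $x, y, z$ respectively in $\phi$. Then

$C_Q\phi = \theta_{431}(\delta\varepsilon\alpha)^2(\zeta\beta\gamma) = 576\,(\delta\varepsilon\alpha)^2(\zeta\beta\gamma)$.

Again, if $\alpha\beta\gamma\delta\varepsilon\zeta$ denote the columns of the unit matrix, the result is zero unless
$\delta\varepsilon\alpha$ include the three different columns, as well as $\zeta\beta\gamma$. For instance

$(\partial/\partial x\ \partial/\partial y\ \partial/\partial z)(\partial/\partial x\ \partial/\partial y)^2_{12}\,(\partial/\partial x)_1\, \phi = 576$
when $\phi = (xyz)(xy)^2_{12}\, x_1$.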
The numerical coefficient 





19 = 9o,2° re 
may be found from the above product pow... un—1 where 
_ (bi + Pin + D! (pi t+... + Pari tmn —# — 1)! 
MO Gu FD! Gua te thea tee DD! 


for these actual values of the $\mu_i$ follow directly by repeated use of the identity
(6). On substituting for the $\mu_i$ in terms of the $q_i$ we obtain⁷

$$(14)\qquad \theta_Q = \prod_i \mu_i = \frac{\prod_{r}\,(q_r + n - r)!}{\prod_{r<s}\,(q_r - q_s - r + s)}, \qquad r, s = 1, 2, \dots, n.$$





This is a positive integer, since each $\mu_i$ is, and it is the well-known cofactor of
the number $f_Q$ for $N!$, namely

$$\theta_Q f_Q = N!$$

where $f_Q$ is the group characteristic $\chi_0^{Q}$, or the $Q$th component of the character
$\chi_0$, which was given by Frobenius for the symmetric group of order $N!$, where $N = \Sigma q_i$.
This number $\theta_Q$ can also be defined⁸ by the determinant

$$\Delta_Q = \left|\frac{d_{ij}}{(q_i - i + j)!}\right| = \frac{1}{\theta_Q},$$

where $d_{ij} = 0$ whenever $q_i < i - j$ and $d_{ij} = 1$ whenever $q_i \ge i - j$. The number
of rows and columns in the determinant is taken to be the number of non-zero
suffixes in the set $Q = q_1 q_2 \dots q_n$ ($q_1 \ge q_2 \ge$ etc.).








⁶Cf. [1], and [4] p. 117.
⁷Cf. [5] p. 366.
⁸[4] p. 359. A misprint is here corrected from $j$ to $i - j$.













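For example

$$\Delta_{431} = \begin{vmatrix} \dfrac{1}{4!} & \dfrac{1}{5!} & \dfrac{1}{6!} \\[4pt] \dfrac{1}{2!} & \dfrac{1}{3!} & \dfrac{1}{4!} \\[4pt] 0 & 1 & \dfrac{1}{1!} \end{vmatrix} = \frac{1}{576}.$$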
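The two definitions of the coefficient can be compared numerically. The following is a minimal sketch, not part of the original paper: the function names are illustrative, the suffixes are assumed to be non-zero and in descending order, and the product formula is read as $\prod_r (q_r + n - r)! / \prod_{r<s} (q_r - q_s - r + s)$.

```python
from fractions import Fraction
from math import factorial


def theta_product(q):
    """Product formula: prod_r (q_r + n - r)! / prod_{r<s} (q_r - q_s - r + s)."""
    n = len(q)
    numerator = 1
    for r in range(1, n + 1):
        numerator *= factorial(q[r - 1] + n - r)
    denominator = 1
    for r in range(1, n + 1):
        for s in range(r + 1, n + 1):
            denominator *= q[r - 1] - q[s - 1] - r + s
    return Fraction(numerator, denominator)


def determinant(m):
    """Exact determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = Fraction(0)
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * determinant(minor)
    return total


def theta_determinant(q):
    """Reciprocal of |d_ij / (q_i - i + j)!|, with d_ij = 0 when q_i - i + j < 0."""
    n = len(q)
    rows = []
    for i in range(1, n + 1):
        row = []
        for j in range(1, n + 1):
            k = q[i - 1] - i + j
            row.append(Fraction(1, factorial(k)) if k >= 0 else Fraction(0))
        rows.append(row)
    return 1 / determinant(rows)


Q = (4, 3, 1)
print(theta_product(Q), theta_determinant(Q))  # both evaluate to 576
```

For $Q = 4, 3, 1$ the sum of the suffixes is 8, and $8!/576 = 70$ is the corresponding cofactor.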


The proof that the determinantal and the product formulae for $\theta_Q$ are
equivalent follows at once on evaluating the determinant by Dodgson's method.⁹
If by compact minor we mean a minor chosen in any manner from any
$r$ consecutive rows and any $r$ consecutive columns of the original determinant,
then the method depends upon the systematic condensation of $\Delta_Q$ by the use
of compact minors. Here for instance, if $u_Q$ denotes $1/\theta_Q$, we have the
condensation

$$u_{431} = \begin{vmatrix} u_4 & u_5 & u_6 \\ u_2 & u_3 & u_4 \\ 0 & 1 & u_1 \end{vmatrix}, \qquad \begin{vmatrix} u_{43} & u_{54} \\ u_{2} & u_{31} \end{vmatrix},$$


In this and all such sequences of condensing determinants those elements which
stand within the whole outer border of elements are called pivotal elements
($u_3$ alone is such a pivot in this example). If $v$ is such a pivot and $V$ is the
$3 \times 3$ minor determinant of which $v$ is the central element then $V$ appears as
an element in the next but two member of the sequence. If $v$ happens to be
zero, then by definition of $\Delta_Q$, the three consecutive elements which stand in
the row immediately below that of $v$,
me See. 
. . PPP 
symmetrically, must also be zero. Thus $V$ also vanishes. When $v \ne 0$ the
usual pivotal process is available (for example $u_{43}u_{31} - u_{54}u_{2} = u_{431}u_{3}$, where
$u_3$ can be cancelled since it does not vanish). In either case the process is
definite, and leads to the required result.
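The condensation just described can be traced with exact rational arithmetic. This is a sketch under stated assumptions, not the author's own computation: `dodgson` is an illustrative name, and the routine simply fails with a division error if an interior pivot vanishes, which is the case the text treats separately.

```python
from fractions import Fraction
from math import factorial


def dodgson(matrix):
    """Dodgson condensation of a square matrix with non-zero interior pivots."""
    previous = None                     # the stage before the current one
    current = [row[:] for row in matrix]
    while len(current) > 1:
        m = len(current) - 1
        # every compact 2 x 2 minor of the current stage
        nxt = [[current[i][j] * current[i + 1][j + 1]
                - current[i][j + 1] * current[i + 1][j]
                for j in range(m)] for i in range(m)]
        if previous is not None:
            # divide by the pivotal (interior) elements of the earlier stage
            nxt = [[nxt[i][j] / previous[i + 1][j + 1]
                    for j in range(m)] for i in range(m)]
        previous, current = current, nxt
    return current[0][0]


def u(k):
    """u_k = 1/k!, the one-rowed case of the determinant."""
    return Fraction(1, factorial(k))


delta_431 = [
    [u(4), u(5), u(6)],
    [u(2), u(3), u(4)],
    [Fraction(0), Fraction(1), u(1)],
]
print(dodgson(delta_431))  # 1/576, the single pivot being u_3 = 1/6
```

The intermediate two-rowed stage has the entries 1/360, 1/8640, 1/2, 1/8, whose determinant 1/3456 is divided by the pivot 1/6.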








$$|\,u_{431}\,| = u_{431}.$$


REFERENCES 
[1] A. Capelli, Math. Ann., vol. 29 (1887), 331-338. 
[2] A. Cayley, Collected Works, vol. 1 (1845), 80-94, 95-112. 
[3] J. H. Grace and A. Young, The Algebra of Invariants (Cambridge, 1903), 259. 


[4] H. W. Turnbull, Theory of Determinants, Matrices and Invariants (Glasgow, 1928; 
2nd ed. 1945). 


[5] A. Young, “Quantitative Substitutional Analysis,” Proc. London Math. Soc. (1) vol. 34 
(1902), 361-397, in particular p. 364. 


The University 
St. Andrews, Scotland 


9[4] p. 340. 



















ELEMENTARY ALGEBRAIC TREATMENT OF THE 
QUANTUM MECHANICAL SYMMETRY PROBLEM 


HERMANN WEYL 


1. Stating the problem 


A function $\eta(i_1, \dots, i_f)$ of $f$ quantities $i_\alpha$ varying over the finite range
$i = 1, 2, \dots, n$, is usually called an $n$-dimensional tensor of rank $f$. Any permutation
$p$: $1 \to 1', \dots, f \to f'$ changes this tensor into a tensor $p\eta$ according
to the equation $p\eta(i_1, \dots, i_f) = \eta(i_{1'}, \dots, i_{f'})$. Thus the permutation $p$
appears as a linear operator $p$ in the $n^f$-dimensional space $\Sigma = \Sigma_{n,f}$ of all
$n$-dimensional tensors of rank $f$. $\eta$ is symmetric if $p\eta = \eta$ for all permutations
$p$; it is antisymmetric if $p\eta = \delta_p \cdot \eta$ where $\delta_p = +1$ for the even and $-1$ for
the odd permutations. Let a linear transformation $A$ in $\Sigma$,

$$(1.1)\qquad \eta' = A\eta, \qquad \eta'(i_1 \dots i_f) = \sum_k a(i_1 \dots i_f; k_1 \dots k_f) \cdot \eta(k_1 \dots k_f),$$

be called symmetric¹ if

$$a(i_{1'} \dots i_{f'}; k_{1'} \dots k_{f'}) = a(i_1 \dots i_f; k_1 \dots k_f)$$

for all permutations $p$. $A$ is symmetric if and only if it commutes with all
the permutation operators $p$. The symmetric transformations $A$ form an
algebra $\mathfrak{A}$. The general symmetry problem posed by the quantum theory
of an aggregate of $f$ equal physical entities is this:

(I) to decompose the tensor space $\Sigma$ as far as possible into subspaces $\Pi$ that
are invariant with respect to all symmetric transformations $A$.
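The statement that a symmetric transformation commutes with every permutation operator can be tried out on a small case. The sketch below is only an illustration: the choice $n = 2$, $f = 3$, the random integer matrix and all function names are assumptions, and the symmetric matrix is produced by summing an arbitrary matrix over all simultaneous permutations of its two index rows.

```python
from itertools import permutations, product
import random

n, f = 2, 3
indices = list(product(range(n), repeat=f))   # all argument rows (i_1 ... i_f)
perms = list(permutations(range(f)))


def act(p, i):
    """Permute the argument row: position a receives the entry i[p[a]]."""
    return tuple(i[p[a]] for a in range(f))


def perm_matrix(p):
    """Matrix of the operator (p eta)(i) = eta(act(p, i))."""
    return {(i, k): int(k == act(p, i)) for i in indices for k in indices}


def matmul(x, y):
    return {(i, k): sum(x[i, l] * y[l, k] for l in indices)
            for i in indices for k in indices}


random.seed(0)
raw = {(i, k): random.randint(-3, 3) for i in indices for k in indices}

# Symmetrize: a(i; k) becomes invariant under simultaneous permutation of i and k.
A = {(i, k): sum(raw[act(p, i), act(p, k)] for p in perms)
     for i in indices for k in indices}

commutes = all(matmul(A, perm_matrix(p)) == matmul(perm_matrix(p), A)
               for p in perms)
print(commutes)
```

The check covers one implication only, from symmetry of the coefficients to commutation with every permutation operator.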

An epistemological principle basic for all theoretical science, that of
projecting the actual upon the background of the possible, is here followed by
asking what happens under any possible Schrödinger law of dynamics

$$\frac{h}{i}\,\frac{d\eta}{dt} = A\eta,$$

before taking up the specific law involving the actual energy
operator $A = H$. We have here ignored the further condition which physics
imposes on all energy operators $A$, to wit their Hermitean nature,

$$a(k_1 \dots k_f; i_1 \dots i_f) = \bar{a}(i_1 \dots i_f; k_1 \dots k_f).$$

Essential for the theory of eigenvalues (terms) and eigenfunctions, this
condition is irrelevant for our purposes. For what is invariant under all
Hermitean symmetric transformations stays so even when the Hermitean
restriction is lifted. As algebraists we are glad to get rid of it. For we
propose to carry our investigation through in any number field in which the
equation $f!\,a = 0$ for a number $a$ implies $a = 0$ (field of characteristic 0 or
of a prime characteristic dividing none of the natural numbers $1, 2, \dots, f$).





Received February 28, 1948. 
1We shall adhere to this terminology and not use the word symmetric in the sense presently 
to be mentioned under the name Hermitean. 




It is no wonder that the complete solution of the above symmetry problem 
depends on the theory of representations of the symmetric group of all 
permutations and Young's symmetry operators.²

Let $\Sigma^+$, $\Sigma^-$ denote the linear manifolds of all symmetric or antisymmetric
tensors respectively. Nature has most wisely put a stop to the breaking-up
of $\Sigma$ into isolated compartments $\Pi$ by letting but one of them, the invariant
subspace $\Sigma^-$, come into existence. Such at least is the case if the $f$ entities
of which the aggregate is composed are electrons (Pauli's exclusion principle).
Thereby the symmetry problem (I) loses its significance for physics. Part
of it, however, is restored, if the existence of the spin of the electron is taken
into account but its dynamical influence disregarded, a procedure which is
at least approximately permissible. The situation is then as follows. The
argument $i$ is replaced by a pair $(i\rho)$ with the range $i = 1, \dots, n$ for the
"positional" variable $i$ and the range $\rho = 1, \dots, \nu$ for the "spin" variable $\rho$.
(Actually $\nu = 2$ while the positional variable varies over the continuum of
all possible positions in the physical three-dimensional space.) Set $N = n\nu$.
The possible wave states of the aggregate of $f$ electrons are described by the
antisymmetric $N$-dimensional tensors $\psi(i_1\rho_1, \dots, i_f\rho_f)$ of rank $f$, forming
the space $\Sigma^-_{N,f} = \Omega$. Moreover we envisage the space $\Sigma = \Sigma_{n,f}$ of all
$n$-dimensional tensors $\eta(i_1 \dots i_f)$ of rank $f$, and the space $P = \Sigma_{\nu,f}$ of all
$\nu$-dimensional tensors $\phi(\rho_1, \dots, \rho_f)$ of rank $f$. Any symmetric transformation
$A$ in $\Sigma$,

$$\eta'(i_1 \dots i_f) = \sum_k a(i_1 \dots i_f; k_1 \dots k_f) \cdot \eta(k_1 \dots k_f),$$

induces a transformation $A^*$ in $\Omega$,

$$\psi'(i_1\rho_1, \dots, i_f\rho_f) = \sum_k a(i_1 \dots i_f; k_1 \dots k_f) \cdot \psi(k_1\rho_1, \dots, k_f\rho_f).$$

The central problem is

(II) to decompose $\Omega$ as far as possible into subspaces that are invariant under
the transformations $A^*$ thus induced in $\Omega$ by all symmetric transformations
$A$ in $\mathfrak{A}$.

These $A^*$ form an algebra $\mathfrak{A}^*$. It is also true that any symmetric transformation
$B$ in $P$,

$$(1.3)\qquad \phi'(\rho_1 \dots \rho_f) = \sum_\sigma b(\rho_1 \dots \rho_f; \sigma_1 \dots \sigma_f) \cdot \phi(\sigma_1 \dots \sigma_f),$$

induces a corresponding transformation $B^*$ in $\Omega$,

$$(1.4)\qquad \psi'(i_1\rho_1, \dots, i_f\rho_f) = \sum_\sigma b(\rho_1 \dots \rho_f; \sigma_1 \dots \sigma_f) \cdot \psi(i_1\sigma_1, \dots, i_f\sigma_f).$$

The $B^*$ form an algebra $\mathfrak{B}^*$. Every $A^*$ of $\mathfrak{A}^*$ commutes with every $B^*$ of $\mathfrak{B}^*$.
Not only the problem (I), but also this new symmetry problem (II) may
be solved by means of Young's symmetry operators; cf. GQ, chap. V, § 12.
However, as shall be discussed here in detail, a more elementary approach is
available for the physically important case $\nu = 2$. Indeed the decomposition
of the spin tensor space $P = \Sigma_{2,f}$ into irreducible invariant subspaces under
the algebra $\mathfrak{B}$ of all its symmetric transformations $B$ is readily derived from





²Cf. H. Weyl, Gruppentheorie und Quantenmechanik (2nd ed. Leipzig, 1931) [quoted
as GQ], chap. V, §§ 1-7 and 13-14.










the classical Clebsch-Gordan expansion. From the algebra $\mathfrak{B}$ in $P$ we may
pass to its representation $\mathfrak{B}^*$ in $\Omega$. Because of the commutability of the
elements $A^*$ and $B^*$ of $\mathfrak{A}^*$ and $\mathfrak{B}^*$, decomposition of the generic matrix of
$\mathfrak{B}^*$ entails a "dual" decomposition for $\mathfrak{A}^*$. The deeper lying fact that vice
versa any linear transformation in $\Omega$ that commutes with all $B^* \in \mathfrak{B}^*$ lies
in $\mathfrak{A}^*$ is needed in order to show that the latter decomposition is also one
into irreducible parts.

All linear transformations (matrices) in a $g$-dimensional vector space $\Sigma$
form an algebra of order $g^2$, the complete matric algebra of degree $g$.
Throughout our investigation irreducibility for matric algebras will be
sharpened to completeness. Decomposition of a matrix $C$ into two matrices,
$C = C_1 \dotplus C_2$, is defined by the equation

$$C = \begin{Vmatrix} C_1 & 0 \\ 0 & C_2 \end{Vmatrix}.$$

$1 \circ C$, $2 \circ C$, $3 \circ C$, ... are the abbreviations for $C$, $C \dotplus C$, $C \dotplus C \dotplus C$, ..., and $\sum^{\dotplus}$ is
the summation sign for the addition $\dotplus$ of matrices. Let $\mathfrak{C}$ be a matric algebra
of order $m$ in a $g$-dimensional vector space $\Sigma$. Suppose that, relative to a
suitably chosen coordinate system for $\Sigma$, the generic matrix $C$ of $\mathfrak{C}$
decomposes into $m_1 \circ C_1 \dotplus m_2 \circ C_2 \dotplus \dots$, the matrix $C_\alpha$ of degree $g_\alpha$ occurring
with the multiplicity $m_\alpha > 0$, $g = m_1 g_1 + m_2 g_2 + \dots$. The $g_1^2 + g_2^2 + \dots$
coefficients of the matrices $C_1, C_2, \dots, C_h$ are linear forms of the $m$ parameters
of $\mathfrak{C}$. We speak of complete decomposition if these coefficients are all
linearly independent and thus $m = g_1^2 + g_2^2 + \dots$. For $\nu = 2$ we shall
prove the following

MAIN THEOREM. Relative to a suitably chosen coordinate system for the
space $\Omega$, the generic matrix $A^*$ of $\mathfrak{A}^*$ suffers complete decomposition

$$(1.5)\qquad A^* = \sum\nolimits^{\dotplus}_{u}\,(v + 1) \circ A^*_u.$$

$u$ and $v$ are two non-negative integers related by the equation $2u + v = f$. The
part $A^*_u$ of "valence defect" $u$ and the corresponding "valence" $v$ occurs with
the multiplicity $v + 1$. Set $d = n - f$, $\bar{u} = d + u$. The degree $g^*_u$ of the
matrix $A^*_u$ is given by the formula

$$(1.6)\qquad g^*_u = \binom{n}{u}\binom{n}{\bar{u}}\,\frac{(n + 1)(n + 1 - u - \bar{u})}{(n + 1 - u)(n + 1 - \bar{u})},$$

$\binom{n}{u}$ denoting the binomial coefficient $\dfrac{n!}{u!\,(n - u)!}$. Only those $u$ occur in the sum
(1.5) for which $u \ge 0$, $\bar{u} \ge 0$, $v = n - (u + \bar{u}) \ge 0$.
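A quick consistency check on (1.5) and (1.6): the parts must account for every dimension of the space of antisymmetric tensors of rank $f$ in $2n$ dimensions, which is the binomial coefficient of $2n$ over $f$. The sketch below assumes the reading of (1.6) as $\binom{n}{u}\binom{n}{\bar u}(n+1)(n+1-u-\bar u)/((n+1-u)(n+1-\bar u))$; the function names are illustrative.

```python
from math import comb


def degree(n, f, u):
    """Degree g*_u of the part of valence v = f - 2u, read off from (1.6)."""
    d = n - f
    u_bar = d + u
    v = n - u - u_bar              # equals f - 2u
    if u < 0 or u_bar < 0 or v < 0:
        return 0                   # such u do not occur in the sum (1.5)
    numerator = comb(n, u) * comb(n, u_bar) * (n + 1) * (n + 1 - u - u_bar)
    denominator = (n + 1 - u) * (n + 1 - u_bar)
    return numerator // denominator


def total_dimension(n, f):
    """Sum of (v + 1) * g*_u over all valence defects u with 2u + v = f."""
    return sum((f - 2 * u + 1) * degree(n, f, u) for u in range(f // 2 + 1))


# n = 3 positional states, f = 2 electrons: valences 2 and 0
print(degree(3, 2, 0), degree(3, 2, 1))      # 3 6
print(total_dimension(3, 2), comb(6, 2))     # 15 15
```

Here the part of valence 2 has degree 3 and multiplicity 3, the part of valence 0 has degree 6 and multiplicity 1, and $9 + 6 = 15$.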

Spectroscopically this theorem establishes the existence of non-intercombining
term systems corresponding to the various valences $v$. The terms of valence
$v$ are of multiplicity $v + 1$. Only when the actually existing weak interactions
between the spins are taken into account, each term of valence $v$ splits into a
"multiplet" of $v + 1$ slightly different terms; whereas the weak interaction
between position and spin accounts for weak intercombinations between the













several term systems. The significance of the valence v for chemistry is 
sufficiently indicated by its name. 

After some preliminaries in § 2 the decomposition (1.5) is derived from the
Clebsch-Gordan expansion in § 3. Its completeness will be proved in §§ 4 and 5.


2. Auxiliary propositions 

Schur’s lemma for complete instead of irreducible matric algebras is a 
triviality; nevertheless it may be stated as our 

LEMMA 1. Complete decomposition of the generic matrix $C$ of a matric
algebra $\mathfrak{C}$, $C = m_1 \circ C_1 \dotplus m_2 \circ C_2 \dotplus \dots \dotplus m_h \circ C_h$, implies the same for its commutator
algebra $\mathfrak{D}$, $D = g_1 \circ D_1 \dotplus g_2 \circ D_2 \dotplus \dots$. But degree and multiplicity are interchanged:
the degree $g_\alpha$ of $C_\alpha$ is the multiplicity with which $D_\alpha$ occurs in the
generic matrix $D$ of $\mathfrak{D}$, and the multiplicity $m_\alpha$ of $C_\alpha$ is the degree of $D_\alpha$.

As one knows, the commutator algebra of a given matric algebra $\mathfrak{C}$ consists
of those matrices $D$ that commute with all elements $C$ of $\mathfrak{C}$. As an abstract
algebra $\mathfrak{c}$ the completely decomposed matric algebra $\mathfrak{C}$ of Lemma 1 is the
direct sum of a number of complete matric algebras; indeed $\mathfrak{c}$ consists of all
$h$-uples $(C_1, \dots, C_h)$ of arbitrary matrices $C_1, \dots, C_h$ of the respective
degrees $g_1, \dots, g_h$. We need the following classical proposition, for the
simple proof of which I refer the reader to GQ, p. 271, Satz (6.1).

LEMMA 2. Every representation of the direct sum $\mathfrak{c}$ of $h$ complete matric
algebras is of the form

$$(C_1, \dots, C_h) \to m^*_1 \circ C_1 \dotplus \dots \dotplus m^*_h \circ C_h.$$

(Here some of the multiplicities $m^*_\alpha$ may be zero; this will happen if the
representation is not faithful and hence the representing matric algebra $\mathfrak{C}^*$
is of lower order than $\mathfrak{C}$.)

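Any antisymmetric $n$-dimensional tensor $\eta$ of rank $f$ is completely characterized
by its components $\eta(\iota_1 \dots \iota_f)$ with $\iota_1 < \iota_2 < \dots < \iota_f$, and these are
independent. We have

$$\eta(i_1 \dots i_f) = \delta_i \cdot \eta(\iota_1 \dots \iota_f)$$

for any permutation $i_1 \dots i_f$ of $\iota_1 \dots \iota_f$, $\delta_i = \pm 1$ distinguishing the even
from the odd permutations $\begin{pmatrix} \iota_1 \dots \iota_f \\ i_1 \dots i_f \end{pmatrix}$, and $\eta(i_1 \dots i_f) = 0$ if the numbers
$i_1 \dots i_f$ are not all distinct. Hence $\Sigma^- = \Sigma^-_{n,f}$ does not exist unless $n \ge f$,
and its dimensionality is

$$M^-_n(f) = \frac{n!}{f!\,(n - f)!}.$$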
LEMMA 3. Any linear transformation in $\Sigma^-$ may be written in the form
(1.1) where $a(i_1 \dots i_f; k_1 \dots k_f)$ is antisymmetric in the $f$ arguments $i$,
antisymmetric in the $f$ arguments $k$ [and hence symmetric in the $f$ pairs $(ik)$].
Indeed a linear transformation in $\Sigma^-$,

$$\eta'(\iota_1 \dots \iota_f) = \sum_\kappa a(\iota_1 \dots \iota_f; \kappa_1 \dots \kappa_f) \cdot \eta(\kappa_1 \dots \kappa_f)$$

























(with the sum extending over the possible sequences $\kappa_1 < \dots < \kappa_f$ chosen
from the range $1, 2, \dots, n$) may be written as (1.1) when one puts

$$a(i_1 \dots i_f; k_1 \dots k_f) = \frac{1}{f!}\,\delta_i \delta_k \cdot a(\iota_1 \dots \iota_f; \kappa_1 \dots \kappa_f)$$

for any permutation $i_1 \dots i_f$ of $\iota_1 \dots \iota_f$ and any permutation $k_1 \dots k_f$ of
$\kappa_1 \dots \kappa_f$, and puts $a(i_1 \dots i_f; k_1 \dots k_f) = 0$ in case the numbers $i_1 \dots i_f$
or $k_1 \dots k_f$ are not all distinct.
It follows from this lemma that the algebra $\mathfrak{A}$ of symmetric transformations
is a complete matric algebra in the invariant subspace $\Sigma^-$ of $\Sigma$.
Any symmetric tensor $\eta$ may be completely characterized by its components
$\eta(i_1 i_2 \dots i_f)$ with $i_1 \le i_2 \le \dots \le i_f$, and these are independent. On changing
the labels $i_1 \dots i_f$ into $i_1 + 0$, $i_2 + 1$, $i_3 + 2$, ..., $i_f + (f - 1)$ one sees at
once that the dimensionality of the space $\Sigma^+ = \Sigma^+_{n,f}$ of symmetric tensors
equals

$$M^+_n(f) = \frac{(n + f - 1)!}{f!\,(n - 1)!}.$$

Set $\eta(i_1 \dots i_f) = \eta_{f_1 \dots f_n}$ if $f_1$ of the $f$ arguments $i_1 \dots i_f$ equal 1, $f_2$ of them
equal 2, ..., $f_n$ of them equal $n$. These numbers $\eta_{f_1 \dots f_n}$ corresponding
to the various partitions $f_1 + f_2 + \dots + f_n$ of $f$ can also be used as the independent
components of $\eta$. A typical symmetric tensor arises from a vector
$(x_1, \dots, x_n)$ by the formula

$$(2.1)\qquad \eta(i_1 \dots i_f) = x_{i_1} \cdots x_{i_f} \quad \text{or} \quad \eta_{f_1 \dots f_n} = x_1^{f_1} \cdots x_n^{f_n}.$$

A linear form $l(\eta)$ depending on a variable symmetric tensor $\eta$ is to be written
as

$$l(\eta) = \sum l_{f_1 \dots f_n}\,\eta_{f_1 \dots f_n}$$

with a constant coefficient $l_{f_1 \dots f_n}$ for each partition $f_1 + \dots + f_n$ of $f$.
We make the altogether trivial remark that $l(\eta)$ vanishes identically in $\eta$
provided it vanishes identically in $x$ by dint of the substitution (2.1).
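The two dimension counts and the relabelling argument can be confirmed by direct enumeration. This is a small sketch with illustrative function names; it lists the independent components as increasing or non-decreasing index rows and compares their number with the factorial formulas.

```python
from itertools import combinations, combinations_with_replacement
from math import factorial


def antisymmetric_components(n, f):
    """Index rows i_1 < i_2 < ... < i_f from the range 1..n."""
    return list(combinations(range(1, n + 1), f))


def symmetric_components(n, f):
    """Index rows i_1 <= i_2 <= ... <= i_f from the range 1..n."""
    return list(combinations_with_replacement(range(1, n + 1), f))


def shifted(row):
    """The relabelling i_1 + 0, i_2 + 1, ..., i_f + (f - 1)."""
    return tuple(i + offset for offset, i in enumerate(row))


n, f = 4, 3
m_minus = factorial(n) // (factorial(f) * factorial(n - f))
m_plus = factorial(n + f - 1) // (factorial(f) * factorial(n - 1))
print(len(antisymmetric_components(n, f)), m_minus)   # 4 4
print(len(symmetric_components(n, f)), m_plus)        # 20 20

# The shift turns non-decreasing rows in 1..n into strictly increasing rows
# in 1..n+f-1, and every such increasing row arises exactly once.
same = sorted(map(shifted, symmetric_components(n, f))) == \
    antisymmetric_components(n + f - 1, f)
print(same)
```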

The symmetric transformation $B = \|b(\rho_1 \dots \rho_f; \sigma_1 \dots \sigma_f)\|$ of the algebra
$\mathfrak{B}$ may be looked upon as a symmetric $\nu^2$-dimensional tensor $b(\omega_1, \dots, \omega_f)$
of rank $f$, if each pair $(\rho\sigma)$ is taken as a single argument $\omega$ capable of $\nu^2$ values.
Hence the order of the matric algebra $\mathfrak{B}$ in $P$ is $M^+_{\nu^2}(f)$ [and the order of $\mathfrak{A}$
is $M^+_{n^2}(f)$]. The linear transformation $t = \|t_{\rho\sigma}\|$ in the $\nu$-dimensional vector
space induces the symmetric transformation $B(t)$,

$$(2.2)\qquad b(\rho_1 \dots \rho_f; \sigma_1 \dots \sigma_f) = t_{\rho_1\sigma_1} \cdots t_{\rho_f\sigma_f},$$

in the tensor space $P$. Considering the $\nu^2$ coefficients $t_{\rho\sigma}$ as indeterminates,
we speak of $t$ as the generic element of the linear group $\mathfrak{c}$ and of $t \to B(t)$
as the representation $\mathfrak{c}^f$ of $\mathfrak{c}$. Equation (2.2) is in complete analogy to (2.1),
and the "altogether trivial remark" made above amounts to the following

LEMMA 4. A linear form $l(B)$ depending on an arbitrary element $B$ of $\mathfrak{B}$
vanishes identically if it vanishes identically in the parameters $t_{\rho\sigma}$ for
$B = B(t)$.











As a final lemma we write down a simple formula for the case $\nu = 2$, $\nu^2 = 4$:

LEMMA 5.

$$(2.3)\qquad M^+_4(f) = \frac{(f + 1)(f + 2)(f + 3)}{1 \cdot 2 \cdot 3} = \sum (v + 1)^2$$

where the sum extends over the non-negative members $v$ of the sequence
$f, f - 2, f - 4, \dots$.

Proof. Verify (2.3) for $f = 0, 1$ and the relation

$$M^+_4(f) - M^+_4(f - 2) = (f + 1)^2$$

for all $f \ge 2$.
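Lemma 5 is easily confirmed for small ranks. The sketch below is only a numerical illustration with assumed function names; it evaluates both sides of (2.3) and the difference relation used in the proof.

```python
from math import comb


def order_of_B(f):
    """M_4^+(f) = (f + 1)(f + 2)(f + 3) / (1 * 2 * 3)."""
    return comb(f + 3, 3)


def sum_of_squares(f):
    """Sum of (v + 1)^2 over v = f, f - 2, f - 4, ... with v >= 0."""
    return sum((v + 1) ** 2 for v in range(f, -1, -2))


for f in range(6):
    print(f, order_of_B(f), sum_of_squares(f))
```

For $f = 2$, for example, both sides equal $10 = 3^2 + 1^2$.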


3. The Clebsch-Gordan expansion and the decomposition of $\mathfrak{B}^*$

In this section we assume $\nu = 2$.

The symmetric 2-dimensional tensors $\phi(\rho_1 \dots \rho_v)$ ($\rho = 1, 2$) of rank $v$
($\le f$) form a linear manifold $P^+_v = \Sigma^+_{2,v}$ of $v + 1$ dimensions. In agreement
with a usage established above denote by $\phi_h$ the component $\phi(\rho_1 \dots \rho_v)$
in which $h$ of the $v$ arguments $\rho$ have the value 1 and $v - h$ have the value 2
($h = 0, 1, \dots, v$). The indeterminate transformation $t = \|t_{\rho\sigma}\|$ ($\rho, \sigma = 1, 2$)
in the 2-dimensional vector space induces the transformation

$$\phi'(\rho_1 \dots \rho_v) = \sum_\sigma t_{\rho_1\sigma_1} \cdots t_{\rho_v\sigma_v} \cdot \phi(\sigma_1 \dots \sigma_v)$$

in $P^+_v$, and thus $P^+_v$ appears as the representation space of a definite representation
$Z_v$ of $\mathfrak{c}$ of degree $v + 1$. By multiplying the transformed components
$\phi'_h$ by a fixed power $\Delta^u$ ($u = 0, 1, 2, \dots$) of the determinant $\Delta =
t_{11}t_{22} - t_{12}t_{21}$ one obtains a representation $\Delta^u Z_v$ of $\mathfrak{c}$ of the same degree
$v + 1$. Envisage the subgroup $\mathfrak{c}_0$ of $\mathfrak{c}$, the generic element of which is the
substitution

$$(3.1)\qquad \begin{Vmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{Vmatrix} = \begin{Vmatrix} \lambda & 0 \\ 0 & 1 \end{Vmatrix}$$

with one indeterminate parameter $\lambda$. That substitution multiplies $\phi_h$ by $\lambda^h$
according to the representation $Z_v$, by $\lambda^{u+h}$ according to the representation
$\Delta^u Z_v$. Hence the coordinates in the representation space $\Pi$ of $\Delta^u Z_v$ are so
chosen that they are distinguished by a signature ("magnetic quantum
number") $w = u + h$. This signature is the exponent of the factor $\lambda^w$ taken
on by the coordinate with the label $w$ under the influence of (3.1) and ranges
over the values $w = u, u + 1, \dots, u + v$. [Decomposition of $\Pi$ into one-dimensional
parts invariant with respect to the subgroup $\mathfrak{c}_0$ of $\mathfrak{c}$.]

The 2-dimensional tensors $\phi(\rho_1 \dots \rho_a, \rho_{a+1} \dots \rho_{a+b})$ of rank $a + b$ ($\le f$)
which are symmetric in the first $a$ and symmetric in the last $b$ arguments form
the substratum of the representation $Z_a \times Z_b$ of $\mathfrak{c}$ of degree $(a + 1)(b + 1)$.
The latter breaks up into parts in accordance with the Clebsch-Gordan
formula

$$(3.2)\qquad Z_a \times Z_b = \sum\nolimits^{\dotplus} \Delta^u Z_v,$$

the sum extending over all non-negative integers $u$, $v$ for which $2u + v =
a + b$ and $u \le \min(a, b)$. This follows by induction from the equation

$$Z_a \times Z_b = Z_{a+b} \dotplus \Delta(Z_{a-1} \times Z_{b-1}).$$

A simple proof is to be found, for instance, on pp. 115-117 of GQ.
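Formula (3.2) can at least be checked on the subgroup (3.1), where each representation is described by its list of signatures. The following sketch compares the signatures of $Z_a \times Z_b$ with those of the parts $\Delta^u Z_v$; it is a necessary consistency check only, not a substitute for the proof cited, and the function names are assumptions.

```python
from collections import Counter


def signatures(u, v):
    """Signatures w = u + h, h = 0..v, of the representation Delta^u Z_v."""
    return Counter(u + h for h in range(v + 1))


def product_signatures(a, b):
    """Signatures of Z_a x Z_b: the exponents h1 + h2 of lambda."""
    return Counter(h1 + h2 for h1 in range(a + 1) for h2 in range(b + 1))


def clebsch_gordan_signatures(a, b):
    """Signatures of the sum of Delta^u Z_v with 2u + v = a + b, u <= min(a, b)."""
    total = Counter()
    for u in range(min(a, b) + 1):
        total += signatures(u, a + b - 2 * u)
    return total


print(product_signatures(3, 2) == clebsch_gordan_signatures(3, 2))
```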
























Repeated application of (3.2) leads to a formula of this type:

$$Z_1 \times Z_1 \times \dots \times Z_1 \ (f \text{ factors}) = \sum\nolimits^{\dotplus} g_u \circ \Delta^u Z_v \qquad (2u + v = f).$$

$Z_1 \times \dots \times Z_1$ is nothing but the representation $\mathfrak{c}^f$, $t \to B(t)$, of $\mathfrak{c}$ in $P$, and our
formula states that the matrix $B(t)$ breaks up in the manner described by

$$(3.3)\qquad B(t) = \sum\nolimits^{\dotplus} g_u \circ B_u(t)$$

into partial matrices $B_u(t)$ of degree $v + 1$. Here $u$, $v$ range over all non-negative
integers satisfying the equation $2u + v = f$, and each component
$B_u(t)$ occurs with a certain multiplicity $g_u \ge 0$.

If we now make use of Lemma 4, which also states that two linear forms
$l(B)$ are identical if they become identical by the substitution $B = B(t)$,
we see at once that the generic matrix $B$ of $\mathfrak{B}$ itself breaks up in the same
fashion

$$(3.4)\qquad B = \sum\nolimits^{\dotplus} g_u \circ B_u.$$

Lemma 5 then shows that none of the valences $v = f, f - 2, f - 4, \dots$ is
left out, $g_u > 0$ for $0 \le u \le \tfrac12 f$, and that all the coefficients of the various
matrices $B_u$ are independent linear forms of the $M^+_4(f)$ parameters $b(\rho_1 \dots \rho_f; \sigma_1 \dots \sigma_f)$
of $B$. Hence (3.4) is a complete decomposition.

$\mathfrak{B}^*$ is a representation of $\mathfrak{B}$, and thus Lemma 2 leads to a similar formula

$$(3.5)\qquad B^* = \sum\nolimits^{\dotplus} g^*_u \circ B_u \qquad (g^*_u \ge 0)$$

for the generic matrix $B^*$ of $\mathfrak{B}^*$.

It is not difficult to determine the multiplicities $g^*_u$ explicitly. Specialize
the element $t$ of $\mathfrak{c}$ by (3.1) in $B = B(t)$ and the corresponding $B^*(t)$. The
effect of this specialized $B^*(t)$ upon a tensor component $\psi(i_1\rho_1, \dots, i_f\rho_f)$
is multiplication by $\lambda^w$ if $w$ of the $f$ indices $\rho_1, \dots, \rho_f$ are 1 (and $f - w$ of
them equal 2). A complete set of independent components of $\psi$ of that type
is obtained by choosing

$$\rho_1 = \dots = \rho_w = 1, \quad \rho_{w+1} = \dots = \rho_f = 2; \qquad i_1 < \dots < i_w, \quad i_{w+1} < \dots < i_f.$$

Hence their number $N_w$ equals

$$(3.6)\qquad N_w = \binom{n}{w} \cdot \binom{n}{f - w} = \binom{n}{w} \cdot \binom{n}{\bar{w}}$$

where $d = n - f$ and $\bar{w} = w + d$. According to (3.5) the space $\Omega$ breaks
up into subspaces $\Pi^*_u$ of dimensionality $v + 1$ in each of which $B^*(t)$ induces
the transformation $B_u(t)$. Every one of these $g^*_u$ subspaces $\Pi^*_u$, therefore,
contributes exactly one coordinate of signature $w$ to $\Omega$ provided $u \le w \le
u + v = f - u$. This simple argument yields the recursive formula

$$N_w = \sum g^*_u$$

where $u$ ranges over all integers satisfying the inequalities $u \ge 0$ and $u \le w$,
$u \le f - w$. Consequently

$$(3.7)\qquad g^*_u = N_u - N_{u-1} \qquad (0 \le u \le \tfrac12 f).$$

Put $\bar{u} = d + u$ so that $v = n - u - \bar{u}$. Now (1.6) readily follows from
(3.6) and (3.7), and one sees from this explicit expression that $g^*_u$ is positive













provided $u \ge 0$, $\bar{u} \ge 0$ and $u + \bar{u} \le n$. The range of the valences $v$ actually
occurring in the decomposition of $\mathfrak{B}^*$ is thus circumscribed by the relations

$$v \ge 0, \qquad v \le n - |d|, \qquad v \equiv n + d \pmod 2.{}^{3}$$
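The passage from (3.6) and (3.7) to the closed expression (1.6), and the recursive formula for $N_w$, can be compared numerically. The sketch below is illustrative only: the function names are assumptions, (1.6) is read as $\binom{n}{u}\binom{n}{\bar u}(n+1)(n+1-u-\bar u)/((n+1-u)(n+1-\bar u))$, and $N_w$ is taken to vanish outside $0 \le w \le f$.

```python
from math import comb


def count_components(n, f, w):
    """N_w of (3.6): w positions with spin value 1 and f - w with value 2."""
    if w < 0 or w > f:
        return 0
    return comb(n, w) * comb(n, f - w)


def degree_closed_form(n, f, u):
    """g*_u according to (1.6), with u_bar = d + u and d = n - f."""
    u_bar = n - f + u
    if u < 0 or u_bar < 0 or f - 2 * u < 0:
        return 0
    numerator = comb(n, u) * comb(n, u_bar) * (n + 1) * (n + 1 - u - u_bar)
    return numerator // ((n + 1 - u) * (n + 1 - u_bar))


def consistent(n, f):
    """Check (3.7) and the recursive formula N_w = sum of g*_u for given n, f."""
    for u in range(f // 2 + 1):
        difference = count_components(n, f, u) - count_components(n, f, u - 1)
        if degree_closed_form(n, f, u) != difference:
            return False
    for w in range(f + 1):
        total = sum(degree_closed_form(n, f, u) for u in range(min(w, f - w) + 1))
        if total != count_components(n, f, w):
            return False
    return True


print(all(consistent(n, f) for n in range(1, 7) for f in range(0, 2 * n + 1)))
```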


$\mathfrak{B}^*$ serves merely as a jumping board for $\mathfrak{A}^*$. But since every $A^*$ commutes
with all the transformations $B^*$ of $\mathfrak{B}^*$ the decomposition (1.5) of the generic
matrix $A^*$ of $\mathfrak{A}^*$ is now inferred from Lemma 1. A definite decomposition
according to valences is thus obtained, and for physics this is the most
essential result. However, as long as we have not yet convinced ourselves
that $\mathfrak{A}^*$ is not only contained in, but identical with, the commutator algebra
of $\mathfrak{B}^*$, completeness for the decomposition (1.5) is not ensured. In order to
settle this point (§ 5) one first has to prove that the only operators in $P$ that commute
with the symmetric transformations $B$ are the symmetry operators (§ 4).


4. Symmetric transformations and permutations 


Our present object is the space $\Sigma = \Sigma_{n,f}$ of the $n$-dimensional tensors
$\eta(i_1 \dots i_f)$ of rank $f$. The permutations $p$ and any linear combinations of
them, $a = \sum_p a(p)\,p$, are linear operators in $\Sigma$, $\eta' = a\eta$, which commute with
all the symmetric linear transformations $\eta' = A\eta$,

$$\eta'(i_1 \dots i_f) = \sum_k a(i_1 \dots i_f; k_1 \dots k_f) \cdot \eta(k_1 \dots k_f).$$

We introduce the symmetry quantities $ạ = \sum_p a(p)\,p$ (with arbitrary numbers
$a(p)$ as coefficients)⁴ quite independently from their usage as operators in $\Sigma$.
They form an abstract algebra of order $f!$, the "group ring of the symmetric
group."

Let $\eta$ be a tensor and $i_1, \dots, i_f$ a given sequence of integers from the
interval $1 \le i \le n$. We consider the $f!$ numbers $p\eta(i_1 \dots i_f) = x(p)$ as the
coefficients of a symmetry quantity $x̣ = η̣(i_1 \dots i_f)$. The tensor equation
$\eta' = a\eta$ is equivalent with $η̣' = (η̣) \cdot ậ$ where $ậ$ is the symmetry quantity
with the coefficients $\hat{a}(p) = a(p^{-1})$. Here $η̣$ may be interpreted as the
symmetry quantity with the tensorial coefficients $η̣(p) = p\eta$, or one may
replace $η̣$ and $η̣'$ in our equation by the ordinary symmetry quantities
$η̣(i_1 \dots i_f)$ and $η̣'(i_1 \dots i_f)$ corresponding to any argument combination
$(i_1, \dots, i_f)$.

The group ring is an $f!$-dimensional vector space. In it we envisage those
symmetry quantities $η̣(i_1 \dots i_f)$ that arise from arbitrary tensors $\eta$ and
arbitrary argument combinations $(i_1, \dots, i_f)$, and we determine their linear
closure $\kappa = \kappa_n$, i.e. the smallest linear subspace that comprises them all.


³In passing we notice that the order of the algebra $\mathfrak{B}^*$ may now be evaluated as $\sum (v + 1)^2$,
the sum extending over the non-negative $v$ of the sequence $f', f' - 2, \dots$ where
$f' = \min(n - d, n + d) = \min(f, 2n - f)$, and hence equals $(f' + 1)(f' + 2)(f' + 3)/1 \cdot 2 \cdot 3$. It
should be easily possible to confirm this directly.

⁴The dot under a letter merely serves to indicate that it stands for a symmetry quantity.













Let $\gamma_s$ ($s = 1, 2, \dots, n^f$) be a basis for the space $\Sigma$. Then the elements $x̣$ of $\kappa$
are given by the equation

$$(4.1)\qquad x̣ = \sum \xi_s(i_1 \dots i_f) \cdot γ̣_s(i_1 \dots i_f)$$

where the $\xi_s(i_1 \dots i_f)$ are arbitrary coefficients. Write more explicitly

$$x(p) = \sum \xi_s(i_1 \dots i_f) \cdot p\gamma_s(i_1 \dots i_f) = \sum p^{-1}\xi_s(i_1 \dots i_f) \cdot \gamma_s(i_1 \dots i_f),$$

hence

$$(4.2)\qquad x̣̂ = \sum \gamma_s(i_1 \dots i_f) \cdot ξ̣_s(i_1 \dots i_f).$$

Since $\gamma'_s = a\gamma_s$ implies $γ̣'_s = (γ̣_s) \cdot ậ$ one sees that $x̣ậ$ lies in $\kappa$ if $x̣$ does;
$\kappa$ is therefore not only an algebra, but even a right-ideal. But in (4.2) one
may consider $\xi_s$ as a tensor and the $\gamma_s(i_1 \dots i_f)$ as coefficients; consequently
$x̣̂$ lies in $\kappa$ if $x̣$ does, and thus $\kappa$ is also a left-ideal. Introduce $\xi'_s = a\xi_s$; then
(4.2) yields

$$x̣̂ \cdot ậ = \sum \gamma_s(i_1 \dots i_f) \cdot ξ̣'_s(i_1 \dots i_f),$$

$$(4.3)\qquad ạx̣ = \sum \xi'_s(i_1 \dots i_f) \cdot γ̣_s(i_1 \dots i_f).$$

As a left-ideal $\kappa$ has a generating idempotent $ẹ$. This means that $ẓẹ$ is in $\kappa$
whatever the symmetry quantity $ẓ$, and if $ẓ$ lies in $\kappa$ then $ẓ = ẓẹ$. Similar
statements hold for multiplication by $ệ$ on the left. The ensuing equations
$ệ = ệẹ$ and $ẹ = ệ \cdot ẹ$ show that $ẹ = ệ$. Every tensor $\eta$ satisfies the equation
$e\eta = \eta$.

One more fact about $\kappa$ is of importance. Introduce as the trace $\operatorname{tr}(ạ)$ of a
symmetry quantity $ạ$ the coefficient $a(1)$ corresponding to the identical
permutation 1. The scalar product $\operatorname{tr}(ạḅ) = \sum_p a(p^{-1}) \cdot b(p)$ is clearly a
symmetric and non-degenerate bilinear form of the two arbitrary symmetry
quantities $ạ$ and $ḅ$. This non-degeneracy is preserved under restriction to $\kappa$;
i.e. an $ạ \in \kappa$ such that $\operatorname{tr}(ạḅ) = 0$ for every $ḅ \in \kappa$ is necessarily zero. Indeed
let $ẓ$ be an arbitrary symmetry quantity; then $ḅ = ẓẹ$ is in $\kappa$, hence $\operatorname{tr}(ạẓ \cdot ẹ) = 0$.
But with $ạ$ also $ạẓ$ lies in $\kappa$, therefore $ạẓ \cdot ẹ = ạẓ$. Thus our equation turns
into $\operatorname{tr}(ạẓ) = 0$ for every $ẓ$, and that implies $ạ = 0$.
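The trace form on the group ring can be made concrete for $f = 3$. In the sketch below, which is an illustration and not the author's construction, a symmetry quantity is stored as a dictionary from permutations to integer coefficients; the product convention (composition of the permutations as maps) and all names are assumptions.

```python
from itertools import permutations
import random

f = 3
perms = list(permutations(range(f)))
identity = tuple(range(f))


def compose(p, q):
    """The permutation obtained by applying q first and then p."""
    return tuple(p[q[a]] for a in range(f))


def inverse(p):
    inv = [0] * f
    for a, b in enumerate(p):
        inv[b] = a
    return tuple(inv)


def multiply(x, y):
    """Product of two symmetry quantities in the group ring."""
    out = {p: 0 for p in perms}
    for p in perms:
        for q in perms:
            out[compose(p, q)] += x[p] * y[q]
    return out


def trace(x):
    """Coefficient belonging to the identical permutation."""
    return x[identity]


random.seed(1)
a = {p: random.randint(-5, 5) for p in perms}
b = {p: random.randint(-5, 5) for p in perms}

scalar_product = sum(a[inverse(p)] * b[p] for p in perms)
print(trace(multiply(a, b)) == scalar_product == trace(multiply(b, a)))
```

On the basis of single permutations the form pairs each $p$ with $p^{-1}$ and with nothing else, which is one way to see that it is non-degenerate on the whole group ring.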

THEOREM I. The symmetry quantities $ạ$ if interpreted as operators in $\Sigma$
are the only ones that commute with all symmetric transformations $A$. The
symmetry quantity $ạ$ expressing such an operator can be uniquely normalized
by requiring $ạ$ to lie in $\kappa$.

Proof (cf. GQ, pp. 266-267). Let $L$ be a linear operator in $\Sigma$, $\eta \to L\eta$,


⁵By using deeper algebraic resources than we care to employ in this elementary approach,
Theorem I could be obtained as an immediate consequence of the following two facts: (α)
Every representation $ạ \to a$ of the group ring of the symmetric group breaks up into irreducible
parts (is "fully reducible"); (β) A fully reducible matric algebra coincides with the commutator
algebra of its commutator algebra (R. Brauer). Another variant: Explicit construction by
means of Young's symmetry operators shows that the inequivalent irreducible parts of the
representation $ạ \to a$ are absolutely irreducible and inequivalent, and consequently (α) yields
a complete decomposition. With this additional knowledge (β) can be replaced by the trivial
fact that complete decomposition of a matric algebra implies its identity with the commutator
algebra of its commutator algebra.













commuting with all symmetric $A$. Let $L\gamma_s = \beta_s$, and with the same
coefficients $\xi_s(i_1 \dots i_f)$ as in (4.1) form

$$ỵ = \sum \xi_s(i_1 \dots i_f) \cdot β̣_s(i_1 \dots i_f).$$

I am going to show that the equation $x̣ = 0$ for the arbitrary coefficients
$\xi_s(i_1 \dots i_f)$ implies $ỵ = 0$. Let $\eta$ be any tensor and set

$$\theta = \sum_p x(p^{-1}) \cdot p\eta, \qquad \vartheta = \sum_p y(p^{-1}) \cdot p\eta.$$

Then

$$\theta(i_1 \dots i_f) = \sum_s \sum_k a_s(i_1 \dots i_f; k_1 \dots k_f) \cdot \gamma_s(k_1 \dots k_f),$$
$$\vartheta(i_1 \dots i_f) = \sum_s \sum_k a_s(i_1 \dots i_f; k_1 \dots k_f) \cdot \beta_s(k_1 \dots k_f)$$

where

$$a_s(i_1 \dots i_f; k_1 \dots k_f) = \sum_p p\eta(i_1 \dots i_f) \cdot p\xi_s(k_1 \dots k_f)$$

is clearly the matrix of a symmetric operator $A_s$ in $\Sigma$. As $A_s$ commutes
with $L$ we conclude that $\vartheta = L\theta$. Consequently $\theta = 0$ implies $\vartheta = 0$, and
$x̣ = 0$ implies $\sum_p y(p^{-1}) \cdot p\eta(i_1 \dots i_f) = 0$, or $\operatorname{tr}(ỵỵ^*) = 0$ for every $ỵ^* \in \kappa$.
The quantity $ỵ$ itself is in $\kappa$, and hence the last equation forces $ỵ$ to vanish.
This settled, one concludes that the correspondence $x̣ \to ỵ = Rx̣$ defines
a linear mapping $R$ of $\kappa$ into itself. Formula (4.3) and its parallel

$$ạ \cdot ỵ = \sum \xi'_s(i_1 \dots i_f) \cdot β̣_s(i_1 \dots i_f)$$

prove the mapping $R$ to be a similarity; i.e. it carries $ạx̣$ into $ạỵ$ whatever $ạ$.
Replace $x̣$ and $ạ$ by $ẹ$ and $x̣$. Setting $Rẹ = ậ$ one finds that $x̣ = x̣ẹ$ goes into
$x̣ \cdot Rẹ = x̣ậ$. This statement is equivalent with the $n^f$ equations $\beta_s = a\gamma_s$,
or $L\eta = a\eta$ for every tensor $\eta$. The symmetry quantities $ậ$ and $ạ$ lie in $\kappa$.


5. The reciprocity of A* and B* 


In this section $\nu$ is not assumed to have the special value 2.

THEOREM II. $\mathfrak{A}^*$ is the commutator algebra of $\mathfrak{B}^*$.

Proof. Let

$$C = \|c(i_1\rho_1, \dots, i_f\rho_f; k_1\sigma_1, \dots, k_f\sigma_f)\|$$

be the matrix of any linear transformation in $\Omega$ in the unique normalization
established by Lemma 3. Hence $C$ is antisymmetric in the $f$ pairs $(i\rho)$,
antisymmetric in the $f$ pairs $(k\sigma)$, and thereby symmetric in the $f$ quadruples
$(i\rho, k\sigma)$. Let $X = \|x(\rho_1 \dots \rho_f; \sigma_1 \dots \sigma_f)\|$ be symmetric in the $f$ pairs
$(\rho\sigma)$. Then $CX$ with the components

$$\sum_\tau c(i_1\rho_1, \dots, i_f\rho_f; k_1\tau_1, \dots, k_f\tau_f) \cdot x(\tau_1 \dots \tau_f; \sigma_1 \dots \sigma_f)$$

is certainly antisymmetric in the pairs $(i\rho)$, and since it is symmetric in the
quadruples $(i\rho, k\sigma)$ it is also antisymmetric in the pairs $(k\sigma)$. The same is
true for $XC$. Our hypothesis demands that $CX$ and $XC$ coincide as operators


















in $\Omega$. Hence their matrices in normalized form must be identical. For
fixed $i_1 \dots i_f; k_1 \dots k_f$ the coefficients

$$c(\rho_1 \dots \rho_f; \sigma_1 \dots \sigma_f) = c(i_1\rho_1, \dots, i_f\rho_f; k_1\sigma_1, \dots, k_f\sigma_f)$$

form a matrix $\|c(\rho_1 \dots \rho_f; \sigma_1 \dots \sigma_f)\|$ in $P$ which may be denoted by
$C(i_1 \dots i_f; k_1 \dots k_f)$. Theorem I when applied to $P$ rather than $\Sigma$ shows
that this transformation is of the form $\sum_p t_p\,p$ where

$$t(p) = t_p = t_p(i_1 \dots i_f; k_1 \dots k_f)$$

are the coefficients of a symmetry quantity

$$(5.1)\qquad ṭ = ṭ(i_1 \dots i_f; k_1 \dots k_f)$$

that lies in $\kappa = \kappa_\nu$. Introduce the transformation

$$(5.2)\qquad T_p = \|t_p(i_1 \dots i_f; k_1 \dots k_f)\|$$

in $\Sigma$. Our result may then be written in the form

$$C = \sum_p (T_p \times p),$$

the cross indicating the Kronecker product of a matrix in $\Sigma$ (first factor)
and a matrix in $P$ (second factor). If we are not afraid of making use of a
symmetry quantity $Ṭ$ whose coefficients are the matrices $T_p$ in $\Sigma$ we can
express the fact that each $ṭ$ lies in $\kappa_\nu$ by the equations

$$(5.3)\qquad Ṭẹ = ẹṬ = Ṭ,$$

$ẹ = ẹ_\nu$ being the generating idempotent of $\kappa = \kappa_\nu$.

$C$ is antisymmetric in the pairs $(k\sigma)$. Hence

$$(5.4)\qquad C(q \times q) = \delta_q \cdot C$$

for any permutation $q$. It is antisymmetric in the pairs $(i\rho)$; hence also

$$(5.5)\qquad (q \times q)C = \delta_q \cdot C.$$

In more explicit form (5.4) reads

$$\sum_p \{T_p\,q \times pq\} = \delta_q \sum_p \{T_p \times p\}$$

or

$$(5.6)\qquad \sum_p \{T_{pq^{-1}}\,q \times p\} = \delta_q \sum_p \{T_p \times p\}.$$

In order to avoid confusion use for the moment

$$Q = \|q(i_1 \dots i_f; k_1 \dots k_f)\|$$

as a notation for the linear transformation $q$ in $\Sigma$ and its matrix. Set
$T'_p = T_p Q$,

$$t'_p(i_1 \dots i_f; k_1 \dots k_f) = \sum_l t_p(i_1 \dots i_f; l_1 \dots l_f) \cdot q(l_1 \dots l_f; k_1 \dots k_f).$$

Given a combination $(i_1 \dots i_f; k_1 \dots k_f)$, the symmetry quantity $ṭ'$ with the
coefficients $t'(p) = t'_p(i_1 \dots i_f; k_1 \dots k_f)$ lies in $\kappa_\nu$ because all the quantities
$ṭ(i_1 \dots i_f; l_1 \dots l_f)$ do. (This is true for any linear transformation $Q$ in $\Sigma$.
What holds for $T'_p = T_p Q$ holds likewise for $T''_p = QT_p$.) For a fixed permutation
$q$ the numbers $t^*(p) = t'(pq^{-1})$ are the coefficients of the symmetry
quantity $ṭ^* = ṭ'q̣$. $\|t^*_p(i_1 \dots i_f; k_1 \dots k_f)\|$ is the matrix $T'_{pq^{-1}} = T_{pq^{-1}}Q$.
Hence (5.6) states that $ṭ^*$ and $\delta_q \cdot ṭ$ coincide as symmetry operators in $P$.













But $ṭ^* = ṭ'q̣$ lies in $\kappa_\nu$ because $ṭ'$ does; coincidence as operators in $P$, therefore,
implies identity of the symmetry quantities themselves, $ṭ^* = \delta_q \cdot ṭ$ or

$$T_{pq^{-1}}\,Q = \delta_q \cdot T_p.$$

Setting $q = p$, $T_1 = A$, one finds

$$T_p = \delta_p \cdot Ap.$$

In the same manner (5.5) leads to

$$T_p = \delta_p \cdot pA.$$

The transformation $A$ in $\Sigma$ thus commutes with the permutation operators $p$
in the same space and is therefore symmetric. Because of the antisymmetry
of $\psi(i_1\rho_1, \dots, i_f\rho_f)$ in the pairs $(i\rho)$ the equation

$$\psi' = C\psi = \sum_p (T_p \times p)\psi$$

may be written as

$$\psi' = \sum_p \delta_p\,(T_p\,p^{-1} \times I)\psi = f!\,(A \times I)\psi$$

where $I$ stands for identity, and thus Theorem II is proved.

The normalizing condition (5.3) takes on the form

$$(5.7)\qquad \tilde{e}A = A\tilde{e} = A,$$

$\tilde{e}$ being the idempotent with the coefficients $\delta_p \cdot e(p) = \delta_p \cdot e(p^{-1})$. This, however,
is no surprise. As a matter of fact, $A$ induces the same transformation $A^*$ in
$\Omega$ as $\tilde{e}A\tilde{e}$, and hence, whether or not $A$ satisfies (5.7), it can always be so
modified as to fulfil that relation, without change in the corresponding $A^*$.

Application of Theorem II to $\nu = 2$ shows that the decomposition (1.5) by
valences is complete.


Institute for Advanced Study 
Princeton, New Jersey 




























ORTHOGONAL MATRICES IN FOUR-SPACE 
C. C. MacDUFFEE 


Every proper orthogonal matrix $A$ can be written
$A = e^{Q}$
where $Q$ is a skew matrix [6], and conversely every such matrix $A$ is orthogonal.
It is also known that every proper orthogonal transformation in real Euclidean
four-space may be characterized in terms of quaternions [1, 3] by the equation
$x' = a x b$, $Na = Nb = 1$.
Here the quaternion
$x = x_0 + x_1 i + x_2 j + x_3 k$
determines with the origin a vector having the coordinates $(x_0, x_1, x_2, x_3)$.
The relationship between these two representations was clearly shown by 
Murnaghan [5]. 

The present paper employs the first and second regular representations of 
quaternions by matrices in place of Murnaghan’s “special matrices,” with the
result that known properties of the regular representations can be applied 
directly to this problem. Incidentally an easy method not using infinite 
series is found for finding the skew matrix Q when the orthogonal matrix A 
is given. 


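1. The first and second regular representations of the real quaternion

$\alpha = a_0 + a_1 i + a_2 j + a_3 k$

are, respectively,

$R(\alpha) = a_0 I + a_1 R_1 + a_2 R_2 + a_3 R_3, \qquad S(\alpha) = a_0 I + a_1 S_1 + a_2 S_2 + a_3 S_3,$

where

(1)
$R_1 = \begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad
R_2 = \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \quad
R_3 = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix},$

$S_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad
S_2 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \quad
S_3 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}.$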
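The properties claimed for the matrices in (1) can be checked numerically. The following is a minimal sketch, not part of the original paper: the helper names (`combo`, `R`, `S_T`, `close`) and the two test quaternions are illustrative assumptions. It confirms that the six basis matrices are skew, that $R(\alpha)$ commutes with $S^T(\beta)$, and that their product is orthogonal when both quaternions have norm 1.

```python
# Basis matrices from (1), written as nested lists of rows.
R1 = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]]
R2 = [[0, 0, -1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, -1, 0, 0]]
R3 = [[0, 0, 0, -1], [0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
S1 = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]]
S2 = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]
S3 = [[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def combo(coeffs, mats):
    """Linear combination sum_k coeffs[k] * mats[k]."""
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(4)]
            for i in range(4)]

def R(q):
    """First regular representation a0*I + a1*R1 + a2*R2 + a3*R3."""
    return combo(q, [I4, R1, R2, R3])

def S_T(q):
    """Transpose of the second regular representation."""
    return transpose(combo(q, [I4, S1, S2, S3]))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

# Two unit quaternions chosen only for illustration.
alpha = [0.5, 0.5, 0.5, 0.5]
beta = [0.6, 0.0, 0.8, 0.0]

skew_ok = all(close(transpose(m), combo([-1], [m]))
              for m in [R1, R2, R3, S1, S2, S3])
commute_ok = close(matmul(R(alpha), S_T(beta)), matmul(S_T(beta), R(alpha)))
A = matmul(R(alpha), S_T(beta))
orthogonal_ok = close(matmul(A, transpose(A)), I4)
```

All three flags should come out true; a failure would indicate a transcription error in one of the six matrices.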




















Let $S^T$ denote the transpose of $S$. The six matrices $R_1, R_2, R_3, S_1^T, S_2^T, S_3^T$
are all skew and are linearly independent. The most general 4 by 4 skew
matrix


Received February 25, 1948. 




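(2) $Q = \begin{pmatrix} 0 & q_{01} & q_{02} & q_{03} \\ -q_{01} & 0 & q_{12} & -q_{31} \\ -q_{02} & -q_{12} & 0 & q_{23} \\ -q_{03} & q_{31} & -q_{23} & 0 \end{pmatrix}$

is therefore a linear combination of them. In fact
$Q = -\tfrac12(q_{01} + q_{23})R_1 - \tfrac12(q_{02} + q_{31})R_2 - \tfrac12(q_{03} + q_{12})R_3$
$\qquad - \tfrac12(q_{01} - q_{23})S_1^T - \tfrac12(q_{02} - q_{31})S_2^T - \tfrac12(q_{03} - q_{12})S_3^T.$
Note the analogy of the $q$'s to Plücker line coordinates [2].
If we let $-\tfrac12(q_{01} + q_{23}) = r_1$, $-\tfrac12(q_{01} - q_{23}) = s_1$, etc., we may write
$\rho = r_1 i + r_2 j + r_3 k, \qquad \sigma = s_1 i + s_2 j + s_3 k.$
That is, every skew matrix can be written
$Q = R(\rho) + S^T(\sigma),$
where $\rho$ and $\sigma$ are pure quaternions. Therefore $\rho$ satisfies the quadratic
equation
(3) $x^2 + N\rho = 0, \qquad N\rho = r_1^2 + r_2^2 + r_3^2,$
and similarly for $\sigma$.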

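The matrix $e^{Q}$ is defined as a power series which converges for every matrix
$Q$. In every associative algebra, every matrix of the first regular representation
is commutative with the transpose of every matrix of the second regular
representation [4]. It follows upon multiplying power series that

$e^{Q} = e^{R(\rho)} \cdot e^{S^T(\sigma)} = e^{S^T(\sigma)} \cdot e^{R(\rho)}.$

Write $R$ for $R(\rho)$. Then

$e^{R} = I + R + \frac{1}{2!}R^2 + \frac{1}{3!}R^3 + \dots.$

From (3), $R^2 = -v^2 I$ where $v^2 = N\rho$, $v \ge 0$. Hence

$e^{R} = \left(1 - \frac{1}{2!}v^2 + \frac{1}{4!}v^4 - \dots\right) I + \left(1 - \frac{1}{3!}v^2 + \frac{1}{5!}v^4 - \dots\right) R,$

$e^{R} = \cos v \cdot I + \frac{R}{v} \sin v.$

If we define the quaternion

(4) $\alpha = \cos v + \frac{\rho}{v} \sin v,$

then clearly

$R(\alpha) = e^{R}, \qquad N\alpha = 1.$

By means of (4) every pure quaternion $\rho$ determines a unit quaternion $\alpha$
and vice versa. Similarly

$e^{S^T(\sigma)} = S^T(\beta), \qquad N\beta = 1.$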
We have proved 


THEOREM 1. Every real proper 4 by 4 orthogonal matrix can be written
$A = R(\alpha) \cdot S^T(\beta) = S^T(\beta) \cdot R(\alpha)$
where $\alpha$ and $\beta$ are unit quaternions. Every such product is orthogonal and proper.
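The closed form $e^{R} = \cos v \cdot I + (R/v)\sin v$ used in the proof can be compared with a partial sum of the exponential series. The sketch below is an illustration only: the pure quaternion `rho`, the number of series terms, and the helper names are assumptions, and `left_matrix` simply writes out $a_0 I + a_1 R_1 + a_2 R_2 + a_3 R_3$ entry by entry.

```python
import math

def left_matrix(q):
    """R(q) = a0*I + a1*R1 + a2*R2 + a3*R3 written out explicitly."""
    a0, a1, a2, a3 = q
    return [[a0, -a1, -a2, -a3],
            [a1,  a0, -a3,  a2],
            [a2,  a3,  a0, -a1],
            [a3, -a2,  a1,  a0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def exp_series(m, terms=40):
    """Partial sum I + M + M^2/2! + ... of the exponential series."""
    total = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        total = [[total[i][j] + term[i][j] for j in range(4)] for i in range(4)]
    return total

# A pure quaternion rho = r1*i + r2*j + r3*k chosen only for illustration.
rho = [0.0, 0.3, -1.1, 0.7]
v = math.sqrt(sum(r * r for r in rho[1:]))

# Unit quaternion from (4): alpha = cos v + (rho / v) sin v.
alpha = [math.cos(v)] + [math.sin(v) / v * r for r in rho[1:]]

lhs = exp_series(left_matrix(rho))   # e^{R(rho)} by power series
rhs = left_matrix(alpha)             # R(alpha)
max_err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4))
norm_alpha = sum(a * a for a in alpha)
```

The two matrices should agree to rounding error, and `norm_alpha` should equal 1, in line with $N\alpha = 1$.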













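Let us assume a second such representation,

$A = R(\gamma) \cdot S^T(\delta), \qquad N\gamma = N\delta = 1.$

Then

$R^{-1}(\gamma) \cdot R(\alpha) = S^T(\delta) \cdot S^{-T}(\beta), \qquad R(\gamma^{-1}\alpha) = S^T(\beta^{-1}\delta).$

The skew components of these matrices vanish, since the skew matrices
in (1) are linearly independent. Thus

$R(\gamma^{-1}\alpha) = S^T(\beta^{-1}\delta) = kI, \qquad k \text{ real},$

so that $\alpha = k\gamma$, $\delta = k\beta$. Since $N\alpha = N\gamma = 1$, $Nk = k^2 = 1$, $k = \pm 1$. We have

THEOREM 2. The pair of quaternions $\alpha$, $\beta$ of Theorem 1 is unique except that
it may be replaced by $-\alpha$, $-\beta$.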


2. The unit quaternion

(5) $\alpha = a_0 + a_1 i + a_2 j + a_3 k, \qquad N\alpha = 1,$

satisfies the quadratic equation

(6) $x^2 - 2a_0 x + 1 = 0,$

whose roots are the characteristic roots of $R(\alpha)$. Since the discriminant is
$-4(a_1^2 + a_2^2 + a_3^2)$, these characteristic roots are real only if $\alpha = \pm 1$. That
is, unless the orthogonal matrix $R(\alpha)$ is $\pm I$, the orthogonal transformation
which it defines leaves no vector through the origin invariant. But if $v$ is
any vector through the origin, the plane of vectors $k_1 v + k_2 R(\alpha) \cdot v$ is invariant.
For by (6)
$R(\alpha)[k_1 v + k_2 R(\alpha) \cdot v] = -k_2 v + (k_1 + 2a_0 k_2) R(\alpha) \cdot v.$
Thus $R(\alpha)$ is the matrix of a left Clifford translation.
Coxeter [1] has shown that in quaternion coordinates the left Clifford
translation is given by
$x' = \alpha x, \qquad N\alpha = 1,$
where $\alpha$ is given by (5), and
$x = x_0 + x_1 i + x_2 j + x_3 k.$

Upon multiplying out and equating the coefficients of 1, $i$, $j$ and $k$, we have

$x'_0 = a_0 x_0 - a_1 x_1 - a_2 x_2 - a_3 x_3,$

$x'_1 = a_1 x_0 + a_0 x_1 - a_3 x_2 + a_2 x_3,$

$x'_2 = a_2 x_0 + a_3 x_1 + a_0 x_2 - a_1 x_3,$

$x'_3 = a_3 x_0 - a_2 x_1 + a_1 x_2 + a_0 x_3.$

If we denote by $v$ the column vector with components $x_0, x_1, x_2, x_3$, this may
be written

$v' = R(\alpha) \cdot v, \qquad N\alpha = 1.$

In the same notation the right Clifford translations may be written

$v' = S^T(\beta) \cdot v, \qquad N\beta = 1.$


3. It has been shown that if $A$ is proper orthogonal,
$A = R(\alpha) \cdot S^T(\beta),$

where $\alpha$ is given by (5) and $\beta$ is given similarly. We shall show how $\alpha$ and $\beta$
can be determined from $A$. From (1)










(7) $A = a_0 b_0 I + \sum_{i,j=1}^{3} a_i b_j R_i S_j^T + a_0 \sum_{j=1}^{3} b_j S_j^T + b_0 \sum_{i=1}^{3} a_i R_i.$

Since $R_i$ and $S_j^T$ are both skew and commutative, their product is symmetric.
Thus the first ten terms above are symmetric and the last six are skew. Hence
the unique skew part of $A$ is
$\tfrac12(A - A^T) = a_0[b_1 S_1^T + b_2 S_2^T + b_3 S_3^T] + b_0[a_1 R_1 + a_2 R_2 + a_3 R_3].$
Since the $R_i$ and $S_j^T$ are linearly independent, we can determine uniquely the
numerical values of
$a_0 b_1, \; a_0 b_2, \; a_0 b_3, \; b_0 a_1, \; b_0 a_2, \; b_0 a_3.$
With the aid of the relations
$a_0^2 + a_1^2 + a_2^2 + a_3^2 = 1, \qquad b_0^2 + b_1^2 + b_2^2 + b_3^2 = 1,$

we obtain quadratic equations for $a_0^2$ and $b_0^2$ and hence the values of the
eight $a_i$ and $b_i$. It is known from Theorem 2 that just two sets of values can
satisfy (7).

When $\alpha$ and $\beta$ are known, $\rho$ and $\sigma$ can be found from (4), and then $Q$ from (2).
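The recovery of $\alpha$ and $\beta$ from $A$ can be sketched in code. The version below is a variant under stated assumptions, not the paper's exact procedure: besides the skew part it also uses the trace, $\operatorname{tr} A = 4a_0 b_0$ (the products $R_i S_j^T$ are traceless), which gives $a_0^2 = (a_0 b_0)^2 + \sum_j (a_0 b_j)^2$ directly and avoids solving the quadratic. It assumes $a_0 b_0 \ne 0$ and fixes the sign ambiguity of Theorem 2 by taking $a_0 > 0$. The function names and test quaternions are hypothetical.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as (q0, q1, q2, q3)."""
    return (p[0]*q[0] - p[1]*q[1] - p[2]*q[2] - p[3]*q[3],
            p[0]*q[1] + p[1]*q[0] + p[2]*q[3] - p[3]*q[2],
            p[0]*q[2] - p[1]*q[3] + p[2]*q[0] + p[3]*q[1],
            p[0]*q[3] + p[1]*q[2] - p[2]*q[1] + p[3]*q[0])

def matrix_of(alpha, beta):
    """Matrix A of the map x -> alpha * x * beta, i.e. R(alpha) S^T(beta)."""
    cols = []
    for k in range(4):
        e = [0.0] * 4
        e[k] = 1.0
        cols.append(qmul(qmul(alpha, e), beta))
    return [[cols[k][i] for k in range(4)] for i in range(4)]

def recover(A):
    """Recover (alpha, beta) with a0 > 0; assumes a0 * b0 != 0."""
    K = [[(A[i][j] - A[j][i]) / 2 for j in range(4)] for i in range(4)]
    t = sum(A[i][i] for i in range(4)) / 4          # t = a0 * b0
    # p[j] = a0 * b_(j+1) and q[i] = b0 * a_(i+1), read off the skew part K.
    p = [(K[2][3] - K[0][1]) / 2, -(K[0][2] + K[1][3]) / 2, (K[1][2] - K[0][3]) / 2]
    q = [-(K[0][1] + K[2][3]) / 2, (K[1][3] - K[0][2]) / 2, -(K[0][3] + K[1][2]) / 2]
    a0 = math.sqrt(t * t + sum(x * x for x in p))   # uses N(beta) = 1
    b0 = t / a0
    alpha = [a0] + [x / b0 for x in q]
    beta = [b0] + [x / a0 for x in p]
    return alpha, beta

# Unit quaternions chosen only for illustration (a0 > 0, b0 != 0).
alpha = (0.5, 0.5, 0.5, 0.5)
norm = math.sqrt(30.0)
beta = (1 / norm, 2 / norm, 3 / norm, 4 / norm)
alpha_rec, beta_rec = recover(matrix_of(alpha, beta))
```

When $a_0 b_0 = 0$ the division fails, and one would fall back on the quadratic equations described in the text.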


REFERENCES 


[1] H. S. M. Coxeter, American Mathematical Monthly, vol. 53 (1946), 136-146.

[2] H. S. M. Coxeter, Non-Euclidean Geometry (Toronto, 1942), 151-153.

[3] E. Guth, Anzeiger, Akademie der Wissenschaften in Wien, Mathematisch-naturwissenschaftliche Klasse, July 6, 1933, Nr. 18.

[4] C. C. MacDuffee, Monatshefte für Mathematik und Physik, vol. 48 (1939), 294.

[5] F. D. Murnaghan, Scripta Mathematica, vol. 10 (1944), 37-49.

[6] J. H. M. Wedderburn, Ann. of Math., vol. 23 (1921), 134.


The University of Wisconsin 













ON CONJUGATE CONVEX FUNCTIONS 
W. FENCHEL 


1. Since the classical work of Minkowski and Jensen it is well known that 
many of the inequalities used in analysis may be considered as consequences 
of the convexity of certain functions. In several of these inequalities pairs of 
“conjugate” functions occur, for instance pairs of powers with exponents 
$a$ and $\alpha$ related by $1/a + 1/\alpha = 1$. A more general example is the pair of
positively homogeneous convex functions defined by Minkowski and known 
as the distance (or gauge) function and the function of support of a convex 
body. The purpose of the present paper is to explain the general (by the way 
rather elementary) idea underlying this correspondence. Subjected to a more 
precise formulation the result is the following: 

To each convex function $f(x_1, \dots, x_n)$ defined in a convex region $G$ and
satisfying certain conditions of continuity there corresponds in a unique way
a convex region $\Gamma$ and a convex function $\varphi(\xi_1, \dots, \xi_n)$ defined in $\Gamma$ and with
the same properties such that

(1) $x_1\xi_1 + \dots + x_n\xi_n \le f(x_1, \dots, x_n) + \varphi(\xi_1, \dots, \xi_n),$

for all points $(x_1, \dots, x_n)$ in $G$ and all points $(\xi_1, \dots, \xi_n)$ in $\Gamma$. The inequality
is exact in a sense explained below. The correspondence between $G$, $f$ and
$\Gamma$, $\varphi$ is symmetric, and the functions $f$ and $\varphi$ are called conjugate.¹

The hypersurfaces $y = f(x_1, \dots, x_n)$ and $\eta = \varphi(\xi_1, \dots, \xi_n)$ correspond to
each other in the polarity with respect to the paraboloid

$2y = x_1^2 + \dots + x_n^2.$

Let $F(x)$ be strictly increasing for $x \ge 0$. Then $f(x) = \int_0^x F(x)\,dx$ is convex,
and its conjugate function is $\varphi(\xi) = \int_0^\xi \Phi(\xi)\,d\xi$ where $\Phi(\xi)$ is the inverse function
of $F(x)$. The inequality (1) for $n = 1$ therefore yields the well-known inequality
of W. H. Young²

$x\xi \le \int_0^x F(x)\,dx + \int_0^\xi \Phi(\xi)\,d\xi.$

(1) may thus be considered as a generalization of this inequality.
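As an illustration that is not worked out in the text, the pair of powers with conjugate exponents mentioned at the start of this section fits Young's inequality. The choice $F(x) = x^{a-1}$ with $a > 1$ is an assumption made here for the example, and the integration variables $t$, $\tau$ are introduced only to keep the limits distinct.

```latex
% Illustrative instance of Young's inequality; F(x) = x^{a-1}, a > 1, is an assumed choice.
\begin{align*}
F(x) &= x^{a-1}, & f(x) &= \int_0^x t^{a-1}\,dt = \frac{x^{a}}{a},\\
\Phi(\xi) &= \xi^{1/(a-1)} = \xi^{\alpha-1}, & \varphi(\xi) &= \int_0^\xi \tau^{\alpha-1}\,d\tau = \frac{\xi^{\alpha}}{\alpha},
\end{align*}
where $\frac{1}{a} + \frac{1}{\alpha} = 1$, so that $\alpha - 1 = \frac{1}{a-1}$. Inequality (1) then reads
\[
x\xi \le \frac{x^{a}}{a} + \frac{\xi^{\alpha}}{\alpha} \qquad (x \ge 0,\ \xi \ge 0).
\]
```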


Received March 24, 1948.
¹The case $n = 1$ has been considered by S. Mandelbrojt [3] under the assumption that the
ranges $G$ and $\Gamma$ are identical with the entire axis $-\infty < x < \infty$. This, however, is incompatible
with the complete reciprocity between $f$ and $\varphi$ which will appear from an example given
below. Mandelbrojt’s formulation of the theorem is thus not quite correct due to the fact
that the least upper bounds occurring in it may be infinite.
²See e.g. [2] p. 111.










If $f(x_1, \dots, x_n)$ is positively homogeneous of degree one, then $G$ is the entire
space $x_1, \dots, x_n$ while $\Gamma$ is closed and bounded, and $\varphi(\xi_1, \dots, \xi_n)$ is identically
zero. In this case (1) expresses that $f(x_1, \dots, x_n)$ is the function of support
of the convex body $\Gamma$.³

2. The euclidean spaces with coordinates $x_1, \dots, x_n$ and $x_1, \dots, x_n, y$
will be denoted by $R^n$ and $R^{n+1}$ respectively, points and vectors in these spaces
by $x$ and $x, y$ respectively. Furthermore we write

$x' + x'' = (x'_1 + x''_1, \dots, x'_n + x''_n), \qquad \lambda x = (\lambda x_1, \dots, \lambda x_n),$
$\sum x\xi = x_1\xi_1 + \dots + x_n\xi_n.$
$\theta$ will always denote a number in the interval $0 < \theta < 1$.

The point set $G$ of $R^n$ is supposed to be convex, i.e. if $x'$ and $x''$ belong to $G$,
the whole segment $(1 - \theta)x' + \theta x''$ belongs to $G$. But $G$ need neither be closed
nor open nor bounded. The interior points of segments belonging to $G$ are
shortly called the interior points of $G$. All other points of accumulation of $G$,
belonging to $G$ or not, will be called the boundary or extreme points of $G$.

A function $f(x)$ defined in $G$ is called convex if

(2) $f((1 - \theta)x' + \theta x'') \le (1 - \theta) f(x') + \theta f(x'')$

for any two points $x'$ and $x''$ of $G$ and all $\theta$. It is well known that this implies
that $f(x)$ is continuous at the interior points of $G$. For our purpose we have
also to consider the behaviour of $f(x)$ at the boundary points. Let $x^*$ be a
boundary point of $G$. For functions of one variable $\lim_{x \to x^*} f(x)$ exists or is $\infty$.
But this is not necessarily the case for functions of several variables. If $x^*$
belongs to $G$ the only general conclusion to be drawn from (2) is that

(3) $\liminf_{x \to x^*} f(x) \le f(x^*);$

for, from

$f((1 - \theta)x + \theta x^*) \le (1 - \theta) f(x) + \theta f(x^*)$

it follows that

$\liminf_{x \to x^*} f(x) \le \lim_{\theta \to 1} f((1 - \theta)x + \theta x^*) \le f(x^*),$

and (2) remains valid if $f(x^*)$ is replaced by any other value satisfying (3).

If necessary, we now change $G$ and $f$ by adding to $G$ all those boundary
points $x^*$ not yet belonging to $G$ for which $\liminf_{x \to x^*} f(x)$ is finite and by defining
$f$ at these and at the boundary points previously belonging to $G$ by

(4) $f(x^*) = \liminf_{x \to x^*} f(x).$

The new $G$ and the function $f$ obtained in this way are obviously again convex;
for, let $x'$ and $x''$ be arbitrary points of the new $G$ and $x'_{(\nu)}$ and $x''_{(\nu)}$, $\nu = 1, 2, \dots$,





³See e.g. [1] p. 23-24.
















sequences of interior points of $G$ such that
$x'_{(\nu)} \to x', \quad x''_{(\nu)} \to x'', \quad f(x'_{(\nu)}) \to f(x'), \quad f(x''_{(\nu)}) \to f(x''),$
then we get from

$f((1 - \theta)x'_{(\nu)} + \theta x''_{(\nu)}) \le (1 - \theta) f(x'_{(\nu)}) + \theta f(x''_{(\nu)})$

for $\nu \to \infty$

$\liminf_{x \to (1-\theta)x' + \theta x''} f(x) \le \liminf_{\nu \to \infty} f((1 - \theta)x'_{(\nu)} + \theta x''_{(\nu)}) \le (1 - \theta) f(x') + \theta f(x''),$

which shows that $(1 - \theta)x' + \theta x''$ belongs to $G$ and that (2) is valid, as the
left-hand side is $f((1 - \theta)x' + \theta x'')$.

With (3) in mind we may say that (4) expresses that the functions which
will be considered in the following are convex and semi-continuous from below,
and $G$ is “closed relative to $f$,” i.e. all boundary points at which $\liminf f(x)$ is
finite belong to $G$, or in other words, at each boundary point which does
not belong to $G$ we have $\lim f(x) = \infty$.


3. The theorem to be proved may now be formulated thus:
Let $G$ be a convex point set in $R^n$ and $f(x)$ a function defined in $G$ convex and
semi-continuous from below and such that $\lim_{x \to x^*} f(x) = \infty$ for each boundary point
$x^*$ of $G$ which does not belong to $G$. Then there exists one and only one point set $\Gamma$
in $R^n$ and one and only one function $\varphi(\xi)$ defined in $\Gamma$ with exactly the same properties
as $G$ and $f(x)$ such that
(5) $\sum x\xi \le f(x) + \varphi(\xi),$
where to every interior point $x$ of $G$ there corresponds at least one point $\xi$ of $\Gamma$ for
which equality holds.

In the same way $G$, $f(x)$ correspond to $\Gamma$, $\varphi(\xi)$.

We define $\Gamma$ as the set of all points $\xi$ with the property that the function
$\sum x\xi - f(x)$ is bounded from above in $G$, and we define $\varphi(\xi)$ in $\Gamma$ as the least
upper bound of this function:

$\varphi(\xi) = \mathrm{l.u.b.}_{x \in G} \left(\sum x\xi - f(x)\right).$

Then (5) is valid. The inequality $\sum x\xi - f(x) \le z$ or
$f(x) \ge \sum x\xi - z$
means that the hyperplane $y = \sum x\xi - z$ in $R^{n+1}$ with the normal vector $\xi, -1$
lies nowhere above the hypersurface $y = f(x)$, and $-z$ is the intercept of this
hyperplane on the $y$-axis. It is a well-known fact that there exists at least
one hyperplane of support of the convex hypersurface, i.e. a hyperplane which
contains at least one point of the hypersurface and lies nowhere above it. This
shows that $\Gamma$ is not empty. Further we see that if there exists a hyperplane
of support with the normal vector $\xi, -1$ and if $x^0, f(x^0)$ is a point of contact,
then we have
$\varphi(\xi) = \sum x^0\xi - f(x^0),$

and $-\varphi(\xi)$ is the $y$-intercept of this hyperplane. If $x^0$ is an arbitrary
interior point of $G$, a hyperplane of support through $x^0, f(x^0)$ exists, and
this proves the assertion on the equality sign in (5).
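The definition of $\varphi$ as a least upper bound can be tried out numerically in one variable. The sketch below is an illustration under stated assumptions: the least upper bound over $G$ is replaced by a maximum over a finite grid, the test function $f(x) = x^2/2$ (whose conjugate is known to be $\xi^2/2$) and the grid are choices made here, and the names `conjugate`, `xs`, `slack` are hypothetical.

```python
def conjugate(f, xs, xi):
    """Approximate l.u.b. of x*xi - f(x) by a maximum over the sample points xs."""
    return max(x * xi - f(x) for x in xs)

def f(x):
    # Test function chosen for illustration; its conjugate is xi**2 / 2.
    return 0.5 * x * x

xs = [i / 1000.0 for i in range(-5000, 5001)]   # grid on [-5, 5]
xi = 1.5
phi = conjugate(f, xs, xi)

# Inequality (5): x*xi <= f(x) + phi(xi) for every sampled x,
# with equality at the point of contact x = xi.
slack = min(f(x) + phi - x * xi for x in xs)
```

Here `phi` approximates $\varphi(1.5) = 1.125$, and `slack` is zero up to rounding because the grid contains the point of contact $x = 1.5$.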













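It is evident that $\Gamma$ and $\varphi(\xi)$ are convex. In fact, let $\xi'$ and $\xi''$ be arbitrary
points of $\Gamma$, then we have for $x \in G$,
$\sum x\xi' - f(x) \le \varphi(\xi'), \qquad \sum x\xi'' - f(x) \le \varphi(\xi''),$
hence $\sum x((1 - \theta)\xi' + \theta\xi'') - f(x) \le (1 - \theta)\varphi(\xi') + \theta\varphi(\xi'')$
which shows that $(1 - \theta)\xi' + \theta\xi''$ belongs to $\Gamma$ and that
$\varphi((1 - \theta)\xi' + \theta\xi'') \le (1 - \theta)\varphi(\xi') + \theta\varphi(\xi'').$
Let now $\xi^*$ be a boundary point of $\Gamma$ and $\xi \in \Gamma$, $x \in G$. Then it follows from (5)
that $\liminf_{\xi \to \xi^*} \varphi(\xi) \ge \sum x\xi^* - f(x)$
and this shows on the one hand that $\xi^* \in \Gamma$ if $\liminf \varphi(\xi)$ is finite, i.e. that $\Gamma$ is closed
relative to $\varphi(\xi)$, and on the other hand that
$\liminf_{\xi \to \xi^*} \varphi(\xi) \ge \varphi(\xi^*),$
i.e. that $\varphi(\xi)$ is semi-continuous from below. Hence $\Gamma$ and $\varphi$ have the same
properties as $G$ and $f$.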


4. It remains to be proved that if we start with $\Gamma$ and $\varphi(\xi)$ the same procedure
gives $G$ and $f(x)$ again. We have to consider the set $G^*$ of all points $x$
for which $\sum \xi x - \varphi(\xi)$ is bounded from above in $\Gamma$, together with the function

$f^*(x) = \mathrm{l.u.b.}_{\xi \in \Gamma} \left(\sum \xi x - \varphi(\xi)\right)$

defined in $G^*$.

If $x \in G$ we get from (5)
(6) $\sum \xi x - \varphi(\xi) \le f(x)$
for all $\xi \in \Gamma$, hence $G \subset G^*$ and $f^*(x) \le f(x)$ in $G$. But to an interior point $x$ of $G$
there corresponds a $\xi$ such that equality is valid in (6), which implies $f^*(x) \ge f(x)$.
Hence $f^*(x) = f(x)$ at the interior points of $G$ and, as both functions are convex
and semi-continuous from below, also at the boundary points of $G$.

Let now $x^0$ be a point of $R^n$ not in $G$. We have to prove that it does not
belong to $G^*$, i.e. that
(7) $\mathrm{l.u.b.}_{\xi \in \Gamma} \left(\sum \xi x^0 - \varphi(\xi)\right) = \infty.$

Since the quantity $\sum \xi x^0 - \varphi(\xi)$ is the $y$-coordinate of the point at which the
hyperplane
$y = \sum \xi x - \varphi(\xi)$

of $R^{n+1}$ intersects the line $x = x^0$ parallel to the $y$-axis, we have to show that
there are hyperplanes below the hypersurface $y = f(x)$ which have arbitrarily
large intercepts on the line $x = x^0$. Suppose first that $x^0$ is an exterior point
of $G$. Then there exists a hyperplane $H$ parallel to the $y$-axis which separates
the line $x = x^0$ from $G$ and $y = f(x)$. Consider any hyperplane of support $S$
of $y = f(x)$. Let $S$ turn around the intersection of $H$ and $S$ so that the part
lying below $y = f(x)$ moves downwards. Then the point at which $S$ intersects
the line $x = x^0$ moves upwards and tends to infinity. Suppose next that $x^0$ is
a boundary point of $G$ but not belonging to $G$. Then we have $f(x) \to \infty$ for
$x \to x^0$. Consider any segment belonging to $G$ and having $x^0$ as one of its end
points. Let $x'$ be a fixed point and $x''$ a variable point of the segment between




$x'$ and $x^0$ such that $f(x'') > f(x')$. A plane of support through $x'', f(x'')$ then
intersects the line $x = x^0$ at a point the $y$-coordinate of which is greater than
$f(x'')$ and therefore tends to infinity if $x'' \to x^0$. This completes the proof of
the theorem.

5. In section 1 it has been asserted that the hypersurfaces $y = f(x)$ and
$\eta = \varphi(\xi)$ correspond to each other in the polarity with respect to $2y = \sum x^2$.
This is obviously true in the sense that each of the hypersurfaces is the envelope
of the polar hyperplanes of the points of the other. For $y = f(x)$ may be
considered as the envelope of the hyperplanes

$y = \sum x\xi - \varphi(\xi),$
where $\xi \in \Gamma$ is the parameter, and the poles of these hyperplanes are the points
$\xi, \varphi(\xi)$.


6. Suppose now that $y = f(x)$ is strictly convex, i.e. each hyperplane of
support contains only one point of $y = f(x)$. Let further $\eta = \varphi(\xi)$ satisfy the
same condition; for $y = f(x)$ this means that there passes at most one hyperplane
of support through a point of $y = f(x)$. Then $f(x)$ has continuous
derivatives⁴

$f_i(x) = \frac{\partial f}{\partial x_i}$
and we have $\xi_i = f_i(x)$.

These relations establish a continuous one to one correspondence between the
interior points of $G$ and those of $\Gamma$. Solving them with respect to the $x$ we get

$x_i = \varphi_i(\xi)$
where, for reasons of symmetry, the $\varphi_i$ must be the derivatives of $\varphi$. From this
it is seen that in the case of $n = 1$ the derivatives of two conjugate convex
functions are mutually inverse functions. This proves the assertion of section 1
on the inequality of Young. Furthermore we get an explicit expression
for $\varphi(\xi)$ if $f(x)$ is given, viz.

$\varphi(\xi) = \sum_i \xi_i \varphi_i(\xi) - f(\varphi_i(\xi))$
valid in the interior of $\Gamma$. Hence, our correspondence between $f$ and $\varphi$ is the
Legendre transformation of the theory of differential equations.
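A worked instance of the explicit expression may help; it is an illustration chosen here, not an example from the paper. Take $n = 1$ and $f(x) = e^x$ on the whole axis. The computation also shows that the ranges $G$ and $\Gamma$ need not coincide.

```latex
% Illustrative example (an assumption, not from the paper): f(x) = e^x, G = (-\infty, \infty), n = 1.
\begin{align*}
\xi &= f'(x) = e^{x}, & x &= \varphi'(\xi) = \log \xi \qquad (\xi > 0),\\
\varphi(\xi) &= \xi \log \xi - f(\log \xi) = \xi \log \xi - \xi .
\end{align*}
The derivatives $e^{x}$ and $\log \xi$ are mutually inverse, and (5) becomes
\[
x\xi \le e^{x} + \xi \log \xi - \xi \qquad (\xi > 0).
\]
For $\xi = 0$ the least upper bound of $-e^{x}$ is $0$, so $\varphi(0) = 0$; for $\xi < 0$ the
expression $x\xi - e^{x}$ is unbounded from above. Hence $\Gamma$ is the half-line $\xi \ge 0$.
```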


REFERENCES 


[1] T. Bonnesen, W. Fenchel, Theorie der konvexen Körper (Berlin, 1934).
[2] G. H. Hardy, J. E. Littlewood, G. Pólya, Inequalities (Cambridge, 1934).
[3] S. Mandelbrojt, “Sur les fonctions convexes,” C.R. Acad. Sci. Paris, vol. 209 (1939), 977-978.


The Technical University of Denmark 
Copenhagen 


⁴See [1] p. 23, 26. The argument used there in the case of positively homogeneous convex
functions may easily be generalized to the case considered here.











ON THE CRITICAL LATTICES OF 
ARBITRARY POINT SETS 


K. MAHLER 


(Dedicated to J. G. van der Corput) 


In this note, I shall establish necessary and sufficient conditions for the 
existence of critical lattices of an arbitrary point set, and I shall construct a 
non-trivial example of a point set without any critical lattice. In a previous 
paper,' I proved that every star body of the finite type possesses at least one 
critical lattice. 


I. 


1. Let $S$ be any point set in $n$-dimensional Euclidean space $R_n$. A
lattice $\Lambda$ is called $S$-admissible if no point of $\Lambda$, except possibly the origin
$O = (0, 0, \dots, 0),$
is an inner point of $S$. Such admissible lattices need not exist, e.g. if $S$ is
the whole space $R_n$; we say in this case that $S$ is of the infinite type, and put

$\Delta(S) = \infty.$

If there are admissible lattices, $S$ is called of the finite type. We then form the
lower bound

$\Delta(S) = \mathrm{l.b.}\; d(\Lambda)$
of the determinants $d(\Lambda)$ of all $S$-admissible lattices, and call this the minimum
determinant of $S$. In the special case that

$\Delta(S) = 0,$
there exist $S$-admissible lattices of arbitrarily small determinant, and $S$ is
called of the zero type; e.g. the null set has this property.


2. A lattice $\Lambda$ is called a critical lattice of $S$ if
(a) $\Lambda$ is $S$-admissible, and
(b) $d(\Lambda) = \Delta(S)$.
It is clear from the definitions just given that $S$ cannot have a critical lattice
if it is of the infinite or the zero types. For there are no $S$-admissible lattices
in the first case; in the second case, the lower bound is not attained since
every lattice is of positive determinant.
In the remaining case, when
(1) $0 < \Delta(S) < \infty,$
the following criterion holds.


Received April 30, 1948. 
¹“On lattice points in n-dimensional star bodies, I,” Proc. Royal Soc., A, 187 (1946),
151-187. The letters LP will be used to mark references to this paper.




THEOREM 1. Let $S$ be a point set in $R_n$ satisfying (1). Then $S$ possesses
at least one critical lattice, if and only if there exists a bounded² infinite sequence
of $S$-admissible lattices

$\Lambda_1, \Lambda_2, \Lambda_3, \dots$
such that

(2) $\lim_{r \to \infty} d(\Lambda_r) = \Delta(S).$

Proof. (i) If there exists a critical lattice $\Lambda$ of $S$, then the infinite sequence
of lattices
$\Lambda, \Lambda, \Lambda, \dots$
has the required properties.
(ii) Assume that
$\Lambda_1, \Lambda_2, \Lambda_3, \dots$
is a bounded infinite sequence of $S$-admissible lattices satisfying (2). We may
then select³ an infinite subsequence
$\Lambda_{r_1}, \Lambda_{r_2}, \Lambda_{r_3}, \dots \qquad (r_1 < r_2 < r_3 < \dots)$
tending to a limit, the lattice $\Lambda$ say. By the continuity of the determinant,
$d(\Lambda) = \lim_{k \to \infty} d(\Lambda_{r_k}) = \Delta(S).$

The assertion is therefore proved if we can show that $\Lambda$ is $S$-admissible,
hence critical. If $\Lambda$ were not $S$-admissible, there would be a point $P \ne O$
of $\Lambda$ which is an inner point of $S$. There exists then a neighbourhood of $P$
consisting only of inner points of $S$. Since the lattices $\Lambda_{r_k}$ tend to $\Lambda$, this
neighbourhood contains a point of $\Lambda_{r_k}$ for all sufficiently large indices $k$,
contrary to the assumption that $\Lambda_{r_k}$ is $S$-admissible.


3. Two special cases of Theorem 1 are of particular interest.
THEOREM 2. If the point set $S$ is of the finite type, and if $O$ is an inner point
of $S$, then $S$ possesses at least one critical lattice.
Proof. Choose an arbitrary infinite sequence of $S$-admissible lattices
$\Lambda_1, \Lambda_2, \Lambda_3, \dots$

satisfying (2). Then this sequence is bounded since none of its points lie
in a sufficiently small neighbourhood of $O$. The assertion follows therefore
immediately from Theorem 1.

THEOREM 3. If the point set $S$ is bounded and not of the zero type, then
it possesses at least one critical lattice.
Proof. Let the assertion be false, i.e. assume that $S$ has no critical lattice.

Denote by $\epsilon$ an arbitrarily small positive number, and by $\rho$ so large a positive
number that $S$ is contained in the sphere

$|X| < \rho.$


²A sequence of lattices $\Lambda_1, \Lambda_2, \Lambda_3, \dots$ is said to be bounded if (i) the determinants $d(\Lambda_r)$
are bounded, and (ii) no point $P \ne O$ of these lattices lies in a certain neighbourhood of $O$.
(LP, Definition 1, p. 155.)

³It is possible to select from any bounded sequence of lattices a subsequence tending
to a limiting lattice. (LP, Theorem 2, p. 156.)













Choose further any infinite sequence of $S$-admissible lattices

$\Lambda_1, \Lambda_2, \Lambda_3, \dots$
satisfying (2). By Theorem 1, this sequence cannot be bounded. Hence
there is an index $k$ such that $\Lambda_k$ contains a point $P_1 \ne O$ at a distance less
than $\epsilon$ from $O$. There is no loss of generality in assuming that $P_1$ is of the form

$P_1 = (\xi, 0, \dots, 0)$, where $0 < \xi < \epsilon$,

since the coordinate system may be so selected that the $x_1$-axis passes through
$P_1$. Let now $P_2, P_3, \dots, P_n$ be the points
$P_2 = (0, \rho, 0, \dots, 0), \; P_3 = (0, 0, \rho, \dots, 0), \; \dots, \; P_n = (0, 0, 0, \dots, \rho),$
and let $\Lambda$ be the lattice of basis $P_1, P_2, \dots, P_n$, hence of determinant
(3) $d(\Lambda) = \xi\rho^{n-1} < \epsilon\rho^{n-1}.$
Then this lattice is $S$-admissible. For $\Lambda$ consists of the points
$P = u_1 P_1 + u_2 P_2 + \dots + u_n P_n \qquad (u_1, u_2, \dots, u_n = 0, \mp 1, \mp 2, \dots).$
Of these lattice points, those with

$\sum_{h=2}^{n} u_h^2 > 0$

lie at a distance not less than $\rho$ from $O$, hence do not belong to $S$. If, however,

$u_1 \ne 0, \quad u_2 = u_3 = \dots = u_n = 0,$

then $P$ belongs to $\Lambda_k$, and so cannot be an inner point of $S$.

Hence

$\Delta(S) \le d(\Lambda) < \epsilon\rho^{n-1},$
whence
$\Delta(S) = 0$

since $\epsilon$ may be arbitrarily small. Therefore $S$ is of the zero type, contrary
to hypothesis.

Theorem 2 contains as a special case my earlier result on the critical lattices
of a star body of the finite type.⁴


II 
4. The question arises whether Theorem 1 has a non-trivial content, 
thus whether there do in fact exist point sets satisfying the condition (1), 
but having no critical lattices. We shall now answer this problem by 
constructing an example of such a point set. But it will first be necessary 
to prove a number of simple lemmas. 


5. Let
$\alpha_1, \alpha_2, \alpha_3, \dots$
be an infinite sequence of positive numbers satisfying
$\alpha_1 < \alpha_2 < \alpha_3 < \dots, \qquad \lim_{r \to \infty} \alpha_r = \infty,$


⁴LP, Theorem 8, p. 159.
















and such that

$\frac{\alpha_r}{\alpha_s}$ is irrational if $r \ne s$.

Denote by $\Sigma$ the set of all products
$u\alpha_r$, where $r, u = 1, 2, 3, \dots$.
Then all elements of $\Sigma$ are positive; no two elements of $\Sigma$ are equal; and any
finite interval contains at most a finite number of elements of $\Sigma$. Hence
if the elements of $\Sigma$,
$\xi_1, \xi_2, \xi_3, \dots$ say,
are arranged according to increasing size,
$0 < \xi_1 < \xi_2 < \xi_3 < \dots,$
then
$\lim_{\mu \to \infty} \xi_\mu = \infty.$
If $t$ is any positive number, and if $\xi_\mu$, $\xi_\nu$ run over all pairs of elements of $\Sigma$
for which
$\xi_\mu \ne \xi_\nu$ (i.e. $\mu \ne \nu$), $\xi_\mu \le t$,
then at most a finite number of the differences
$|\xi_\mu - \xi_\nu|$
are less than an arbitrary given constant. Denote by
$\rho(t) = \min(|\xi_\mu - \xi_\nu|)$
the smallest of these differences; it clearly defines a positive and non-increasing
function of $t$.
Moreover,

$\lim_{t \to \infty} \rho(t) = 0.$

For $\Sigma$ contains the elements,

$u\alpha_r, \; v\alpha_s \qquad (u, v = 1, 2, 3, \dots),$
and, as is well known, there are positive integers $u$, $v$, for which
$|u\alpha_r - v\alpha_s|$
is arbitrarily small.
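The set $\Sigma$ and the gap function $\rho(t)$ can be explored numerically. The sketch below makes several assumptions that are not in the text: it takes $\alpha_r = \sqrt{p_r}$ with $p_r$ the $r$-th prime (the ratios are irrational and $\alpha_r \to \infty$), and it uses the fact that $2\xi_\mu$ also lies in $\Sigma$, so the nearest neighbour above any $\xi_\mu \le t$ is at most $2t$ and only elements up to $2t$ need to be listed. The function names are hypothetical, and the values are floating-point approximations.

```python
import math

def primes_up_to(n):
    """Primes <= n by a simple sieve."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_p in enumerate(sieve) if is_p]

def sigma_elements(alphas, bound):
    """Sorted products u * alpha_r (u = 1, 2, ...) not exceeding bound."""
    elems = []
    for a in alphas:
        u = 1
        while u * a <= bound:
            elems.append(u * a)
            u += 1
    return sorted(elems)

def rho(t):
    """Smallest gap |xi_mu - xi_nu| with xi_mu <= t (requires t >= sqrt(2))."""
    bound = 2 * t
    # Assumed sequence: alpha_r = sqrt(p_r); only alpha_r <= 2t can contribute.
    alphas = [math.sqrt(p) for p in primes_up_to(int(bound * bound))]
    xs = sigma_elements(alphas, bound)
    # The nearest neighbour of each element is one of its sorted neighbours.
    return min(b - a for a, b in zip(xs, xs[1:]) if a <= t)

values = [rho(t) for t in (2, 4, 8, 16)]
```

The list `values` should be positive and non-increasing, in line with the properties of $\rho(t)$ stated above; how quickly it tends to zero depends on the assumed sequence.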
6. From the definition of $\rho(t)$,
(4) $|\xi_\mu - \xi_\nu| \ge \max(\rho(\xi_\mu), \rho(\xi_\nu))$, if $\mu \ne \nu$.





This implies that for no real number x both 


|x —&| <4o(E,) and |x — &| < 40(é), 
unless » = v. For if, e.g. uw < v, then from these inequalities, 
\¢, —&| =| (@ —&) — @ —&)| < 40.) + Blk) < FolE,) < v(,), 
contrary to (4). 
Lemma 1. Let K be the set of all real numbers x satisfying at least one of the 
inequalities 





|x —&,| < $o(2E,) (» = 1, 2,3,...). 










If all multiples 
2*x (k = 0, 1, 2,3,...) 
of x belong to K, then x is an element of >. 
Proof. From the hypothesis, 

| 2% — & | <to(2é,,) (k = 0,1, 2,3,...), 
where the indices 4, depend on k. Therefore, in particular, 

| 2+ — 26,,| < $0(2¢,,), 
since p(t) is a non-increasing function of ¢. But if &, belongs to }, so does 
2¢,; hence these inequalities imply that 


Eu, = 2, (k = 0,1,2,3,...), 


whence 
&, _ Fan: | 2*(x — £0) | < $o(2**¢,,) (k wa 0, 1, 2, 3, ati -)- 
On letting & tend to infinity, the right-hand side tends to zero, and we find that 


x = §. 
as asserted. 


7. We need also the following, rather simpler, result. 


LEMMA 2. Let $\beta$ be a positive number, and let $K'$ be the set of all real numbers
$x$ satisfying at least one of the inequalities

$|x - u\beta| \le \frac{\beta}{4} \qquad (u = 1, 2, 3, \dots).$
If all multiples
$2^k x \qquad (k = 0, 1, 2, 3, \dots)$
belong to $K'$, then $x$ is a positive integral multiple of $\beta$.
Proof. By hypothesis,
$|2^k x - u_k\beta| \le \frac{\beta}{4} \qquad (k = 0, 1, 2, 3, \dots)$
with integers $u_k$ depending on $k$. Therefore, in particular,
$|2^{k+1}x - 2u_k\beta| \le \frac{\beta}{2}, \qquad |2^{k+1}x - u_{k+1}\beta| \le \frac{\beta}{4},$
whence

$|(u_{k+1} - 2u_k)\beta| = |(2^{k+1}x - 2u_k\beta) - (2^{k+1}x - u_{k+1}\beta)| \le \frac{\beta}{2} + \frac{\beta}{4} = \frac{3\beta}{4},$
and therefore
$|u_{k+1} - 2u_k| \le \frac{3}{4}, \quad u_{k+1} = 2u_k, \quad u_k = 2^k u_0 \qquad (k = 0, 1, 2, 3, \dots),$
since the $u$'s are integers. Hence,

$|2^k x - 2^k u_0\beta| \le \frac{\beta}{4} \qquad (k = 0, 1, 2, 3, \dots),$

or

$|x - u_0\beta| \le \frac{\beta}{2^{k+2}}.$

On allowing $k$ to tend to infinity, we find that

$x = u_0\beta,$
as asserted.


8. From the last two lemmas, we deduce a similar result for a special point 
set in n-dimensional space R,. 


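Denote by
$\alpha_1, \alpha_2, \alpha_3, \dots$ and $\beta_1, \beta_2, \beta_3, \dots$
two infinite sequences of positive numbers satisfying the following conditions:
(I) $\alpha_1 < \alpha_2 < \alpha_3 < \dots, \qquad \lim_{r \to \infty} \alpha_r = \infty,$
$\beta_1 > \beta_2 > \beta_3 > \dots, \qquad \lim_{r \to \infty} \beta_r = 0,$
$\alpha_1\beta_1 > \alpha_2\beta_2 > \alpha_3\beta_3 > \dots, \qquad \lim_{r \to \infty} \alpha_r\beta_r = 1.$
(II) If
$y_1, y_2, \dots, y_r$
is any finite system of integers not all zero, then⁵
$\alpha_1 y_1 + \alpha_2 y_2 + \dots + \alpha_r y_r \ne 0.$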
Let further ™, u,..., #, and r run over all positive integers, and denote 
by 
TI") (4) = TI") (145, tg, ... , thn) 
the parallelepiped of all points 
X = (x, Xe, ..-» Xn) 
which satisfy the inequalities 
| xy — apts | <$p(2a,-m), | x2 — Bete | < = lx,n — us| <% (kh =3,4,...,n); 


here p(¢) is the function defined in 5. The centre of II‘ (u) is at the point, 
P (u) = P') (4, tg, ..., tn) = (a,tt;, Brtle, U3, ..., Un). 
Denote then by 
n = U n(x) 


the sum set of all parallelepipeds II‘ (u), and by 
P = { P\”)(u)} 
the set of all points P‘”(u). Since, from (4), 
p(2é,) < &, 


because both & and 2£, belong to 2, the two point sets II and P lie completely 
in the octant 


a2enmae @.-..Ho @ 


'The conditions (I) and (II) are satisfied if, e.g. 


a, = (1++)e, 6, = (1 +e" (r = 1, 2,3,...), 


as is trivial for (I), and follows for (II) from the transcendency of e. 













Lemma 3. Let the point X = (x1, %2, ..., Xn) be such that all multiples 
2*X = (2*x,, 2*xe,..., 2*x2) (k = 0,1,2,3,...) 
belong to Il. Then X is an element of P. 
Proof. The first coordinate x, of X lies in one of the intervals 


| x1 — & | < $o(2E,) (» = 1,2, 3,...); 
the second coordinate x2 lies in one of the intervals 
| #2 — Brtne| < (7, we = 1, 2,3, ...)3 
and the remaining coordinates x, (h = 3, 4, ..., m) lie in intervals 
lan — ua| <3 (wu, = 1, 2,3,...); 


moreover, analogous conditions are also satisfied by the coordinates of the 
points 

2*xX (k = 1, 2,3, ...). 
Therefore, by Lemma 1, x; belongs to 2, so that 
(5) x = arty 
for some pair of positive integers r and u,. The same index r occurs in the 
inequalities for the multiples 2"x. of x,; by Lemma 2 applied with 8 = @,, 
there is therefore a positive integer uw, such that 
(6) Xe = Byte. 
Finally, by the same lemma applied with 8 = 1, there exist »—2 positive 
integers 3, Us,..., &#, such that 
(7) xh = Un (hk = 3,4,...,). 
The assertion is contained in (5), (6), and (7). 


9. We also need the following simple lemma about the bases of a lattice. 
LemMA 4. For every lattice A, a basis 


VY: = (yu, Via, - - - » Vin)» Vo = (ary Ye2,--- > Vandy--- + Vn = (Ynty Ynd,-- +» Van) 
can be found such that 
(8) yn > 1 (h,k = 1,2,...,m). 
Proof. First choose an arbitrary point Y; = (yu, yw, ..., Yin) of A with 
fa > 1, Fa 1,.-es Ra” I 


such that no inner point of the line segment joining O to Y; belongs to A. 
By Minkowski’s selection method®, »—1 further lattice points Y’s, Y’;,..., Y's 
can be chosen such that the » points 


Pa em cues Wa 
form a basis of A. Then the further » points 
Yi, Yo = Y'2+mV¥i, Ys = Y's +03¥%1,..., Yn = V'n + oaks 
where v2, 03, ..., Un, are n—1 arbitrary integers, also form a basis of A. We 


satisfy now the conditions (8) by taking the v’s positive and sufficiently large. 





*Geometrie der Zahlen, § 46. 



















10. As in 8, we let m, tw, ..., %, and r run over all positive integers, 
but denote now by 
To’ (4) = Io" (143, te, ... , thn) 
the open parallelepiped of all points X inate 


[x1 — arty | < Bo(Bartn), | x2 — Bete | <5 »| xn — ua|< 4 (h = 3,4,...,%); 


its centre is again at the point P‘”(w), oil its closure is II‘ (x). 
Further denote by 
Mo = U to” (u) 
u.f 


the sum of all parallelepipeds Ip‘” (u), and by @ the point set 
ia Se, ie oe Boos Sie te 
The difference set 
S=2 — Il 
of all points of 2 which are not in Ilo, is evidently closed, since Ilo, as a sum 
of open sets, is open, and since @ is closed because R, does not contain a 
point at infinity. 

There are at most a finite number of points P‘”)(u) in every finite portion 
of 2. Therefore every point of S is either an inner point of S, or a boundary 
point of Q, or it is a boundary point of one of the closed parallelepipeds II” (x), 
hence belongs to II. 

11. Let now $\Lambda$ be any S-admissible lattice. Then choose a basis $Y_1, Y_2, \ldots, Y_n$
of $\Lambda$ satisfying the condition (8) of Lemma 4. These n points, and
also the vector sum
$$Y = Y_1 + Y_2 + \ldots + Y_n$$
are not inner points of S, nor are they boundary points of $\Omega$; and the same is
true even for the multiples
(9) $2^k Y_1,\ 2^k Y_2,\ \ldots,\ 2^k Y_n,\ 2^k(Y_1 + Y_2 + \ldots + Y_n) = 2^k Y \quad (k = 0, 1, 2, 3, \ldots)$.
Hence all points (9) belong to $\Pi$. But then, by Lemma 3, the n + 1 points
$$Y_1,\ Y_2,\ \ldots,\ Y_n,\ Y$$
are elements of P, and so there exist positive integers
$$r_1, r_2, \ldots, r_n, r$$
and
$$u_{hk},\ u_k \quad (h, k = 1, 2, \ldots, n)$$
such that
$$Y_h = (\alpha_{r_h} u_{h1},\ \beta_{r_h} u_{h2},\ u_{h3},\ \ldots,\ u_{hn}) \quad (h = 1, 2, \ldots, n),$$
$$Y = Y_1 + Y_2 + \ldots + Y_n = (\alpha_r u_1,\ \beta_r u_2,\ u_3,\ \ldots,\ u_n).$$
Therefore, in particular,
$$\alpha_{r_1} u_{11} + \alpha_{r_2} u_{21} + \ldots + \alpha_{r_n} u_{n1} = \alpha_r u_1.$$
By the hypothesis (11) of 8, this equation can hold only if
$$r_1 = r_2 = \ldots = r_n = r, \qquad u_{11} + u_{21} + \ldots + u_{n1} = u_1.$$











86 K. MAHLER 


Hence all basis points $Y_h$ belong to the same value of r, and the basis is of
the form
$$Y_h = (\alpha_r u_{h1},\ \beta_r u_{h2},\ u_{h3},\ \ldots,\ u_{hn}) \quad (h = 1, 2, \ldots, n).$$
Denote now by $\Lambda_r$ the lattice of all points
$$P = (\alpha_r g_1,\ \beta_r g_2,\ g_3,\ \ldots,\ g_n),$$
where the g's run over all integers; this lattice is of determinant
$$d(\Lambda_r) = \alpha_r \beta_r.$$
Since the basis elements $Y_h$ of $\Lambda$ belong to $\Lambda_r$, $\Lambda$ is either identical with $\Lambda_r$
or it is a sublattice. In either case,
$$d(\Lambda) = g\, d(\Lambda_r),$$
where g is a positive integer. Hence, by the hypothesis (1) of 8,
$$d(\Lambda) \ge d(\Lambda_r) > 1, \quad \text{and} \quad d(\Lambda) > 2 \ \text{if} \ g > 1.$$
In the other direction, from the same hypothesis,
$$\lim_{r \to \infty} d(\Lambda_r) = 1.$$

We find therefore the following result: 


THEOREM 4. The only admissible lattices of the set S are (i) the lattices
$\Lambda_1, \Lambda_2, \Lambda_3, \ldots$, and (ii) their sublattices. All S-admissible lattices are of
determinant greater than 1, but
$$\lim_{r \to \infty} d(\Lambda_r) = 1.$$
Hence $\Delta(S) = 1$, and there are no critical lattices of S.


12. Theorem 4 implies, in particular, that S has only an enumerable set 
of admissible lattices, a possibility which cannot arise for star bodies. It 
is further clear that no point of any S-admissible lattice lies on the boundary 
of S. 

The following, somewhat simpler, example of a point set is possibly even 
more surprising. Denote by T the set of all points X such that 


max( |x; — |, | x2 — ue|,...,| an — tal) > 3 


for every system of integers $u_1, u_2, \ldots, u_n$. It is not difficult to deduce from
Lemma 2 that the only T-admissible lattices are (i) the lattice of all points
with integral coordinates, and (ii) all its sublattices. Therefore $\Delta(T) = 1$,
and there is just one critical lattice. Every point of this critical lattice lies
at a distance $ from the boundary of 7, and the same is true for the points 
of the T-admissible lattices. This is very different from the position for
star bodies; for every critical lattice of a star body has at least one point
arbitrarily near to its boundary.


Postscript (June 1948) 


Mr. C. A. Rogers, having been told of my result, found the following 
simpler example of a point set without a critical lattice: 


X\Xe 
m1>0, m>Q0, wm (1— —#) <1. 
xy" + x,” 
This two-dimensional set differs from my example in having a continuous 
infinity of admissible lattices. 


University of Manchester











THE NONEXISTENCE OF CERTAIN FINITE 
PROJECTIVE PLANES 


R. H. BRUCK and H. J. RYSER 


1. Introduction. A projective plane geometry $\pi$ is a mathematical
system composed of undefined elements called points and undefined sets
of points (at least two in number) called lines, subject to the following three
postulates:

($P_1$) Two distinct points are contained in a unique line.

($P_2$) Two distinct lines contain a unique common point.

($P_3$) Each line contains at least three points.

The projective plane $\pi$ is finite if it consists of a finite number of points.
If $\pi$ is finite, then there exists a positive integer N such that each line of
$\pi$ contains exactly N + 1 distinct points, and each point is contained in
exactly N + 1 distinct lines. Moreover, $\pi$ has exactly $N^2 + N + 1$ distinct
points and $N^2 + N + 1$ distinct lines (see [3], [6], [13]).

In all known finite geometries the integer N is a power of a prime. Indeed,
for every prime p and for every positive integer n, finite geometries with
$N = p^n$ have been constructed by means of the Galois fields $GF[p^n]$ (see [12]).
It is still an unsettled question whether or not N must be the power of a
prime. In this connection it has been shown that there does not exist a
finite geometry for N = 6 (see [11]). The purpose of our paper is to prove the
following more general theorem on the non-existence of finite geometries.

THEOREM 1. If $N \equiv 1$ or $2 \pmod 4$ and if the square free part of N contains
at least one prime factor of the form 4k + 3, then there does not exist a finite
projective plane geometry with N + 1 points on a line.

In section 2 finite geometries are studied in connection with matrices whose 
elements are non-negative integers. The Minkowski-Hasse theory on the 
equivalence of quadratic forms under rational transformations is discussed 
in section 3, and the results of sections 2 and 3 are then utilized in section 4 
to prove Theorem 1. 

It is to be noted that Theorem 1 asserts in particular that a geometry
does not exist for $N = 2p$, where $p$ is a prime of the form 4k + 3. Moreover,
a finite plane with N + 1 points on a line can always be constructed from a
given complete set of mutually orthogonal Latin squares of order $N \ge 3$ (see
[1], [8]). Thus for any N of Theorem 1 there does not exist a complete set of
mutually orthogonal Latin squares of order N.
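The arithmetic condition of Theorem 1 is easy to tabulate. The following Python sketch is an illustration added here, not part of the original paper; the function names are hypothetical. It lists the orders N up to 30 that the theorem excludes.

```python
def squarefree_part(n: int) -> int:
    """Product of the primes that divide n to an odd power."""
    result = 1
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if exponent % 2 == 1:
            result *= p
        p += 1
    return result * n  # a leftover n > 1 is a prime occurring to the first power


def prime_factors(n: int) -> list[int]:
    """Distinct prime factors of n in increasing order."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def excluded_by_theorem_1(N: int) -> bool:
    """True if N = 1 or 2 mod 4 and the square free part of N has a prime factor 4k + 3."""
    if N % 4 not in (1, 2):
        return False
    return any(p % 4 == 3 for p in prime_factors(squarefree_part(N)))


excluded = [N for N in range(2, 31) if excluded_by_theorem_1(N)]
print(excluded)  # prints [6, 14, 21, 22, 30]
```

Note that N = 10 and N = 18 satisfy the congruence condition but are not excluded, since their square free parts (10 and 2) have no prime factor of the form 4k + 3.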


2. The Incidence Matrix. An n-rowed square matrix A each of whose 
elements is zero or one is an incidence matrix provided it satisfies the following 
three conditions: 


Received May 7, 1948. 




















FINITE PROJECTIVE PLANES 89 


($I_1$) If $r_1$ and $r_2$ are two distinct rows of A, then there is a unique integer j
such that the rows $r_1$ and $r_2$ each have the integer one in the jth column.

($I_2$) If $c_1$ and $c_2$ are two distinct columns of A, then there is a unique
integer i such that the columns $c_1$ and $c_2$ each have the integer one in the
ith row.

($I_3$) Each row of A contains at least three ones.

THEOREM 2. If $\pi$ is a finite projective plane geometry with N + 1 points
on a line, then there exists an incidence matrix A of order $n = N^2 + N + 1$.
If $A^T$ denotes the transpose of the matrix A, then
(M) $B = AA^T = A^TA$,
where B is an integral matrix with N + 1 down the main diagonal and ones
in all other positions.

For let the $N^2 + N + 1$ points of $\pi$ be numbered in any convenient order
$1, 2, \ldots, N^2 + N + 1$ and listed in a row. Let the $N^2 + N + 1$ lines be
numbered similarly $1, 2, \ldots, N^2 + N + 1$ and listed in a column. Then
let a table of $N^2 + N + 1$ rows and $N^2 + N + 1$ columns be formed by
inserting a one in row i and column j if line i contains point j, and a zero
in the contrary case. Then by the properties of the geometry $\pi$ given in
section 1, it follows that the table yields an incidence matrix A which satisfies
the equation (M).
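The construction in this proof can be tried on a small case. The following Python sketch is not part of the original paper: it assumes the familiar homogeneous-coordinate model of the plane over GF(p) for a prime p (points and lines are normalized non-zero triples, and a point lies on a line when their dot product vanishes mod p), and all function names are hypothetical. It builds the table described above and checks equation (M) with N = p.

```python
from itertools import product


def normalized_triples(p: int) -> list[tuple[int, ...]]:
    """Non-zero triples over GF(p) whose first non-zero coordinate is 1."""
    triples = []
    for t in product(range(p), repeat=3):
        first_nonzero = next((c for c in t if c != 0), None)
        if first_nonzero == 1:
            triples.append(t)
    return triples


def incidence_matrix(p: int) -> list[list[int]]:
    """Row i, column j holds a one if line i contains point j, and a zero otherwise."""
    points = normalized_triples(p)
    lines = normalized_triples(p)
    return [[1 if sum(a * x for a, x in zip(line, point)) % p == 0 else 0
             for point in points] for line in lines]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def satisfies_M(A, N: int) -> bool:
    """Check A A^T = A^T A = B, with N + 1 on the diagonal of B and ones elsewhere."""
    n = len(A)
    B = [[N + 1 if i == j else 1 for j in range(n)] for i in range(n)]
    At = transpose(A)
    return matmul(A, At) == B and matmul(At, A) == B


for p in (2, 3, 5):
    A = incidence_matrix(p)
    print(p, len(A), satisfies_M(A, p))
# prints
# 2 7 True
# 3 13 True
# 5 31 True
```

The orders 7, 13, 31 are the values of $N^2 + N + 1$ for N = 2, 3, 5.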

THEOREM 3. If a matrix A with non-negative integral elements and of
order n > 1 satisfies the equation (M), where $N \ge 2$, then A is an incidence
matrix and defines a finite projective plane geometry with N + 1 points on a
line.

The matrix A must be composed entirely of zeros and ones. For if $a_{ij}$
were an element of A in row i and column j and if $a_{ij}$ were greater than one,
then by equation (M) each element in column j of A except $a_{ij}$ would be zero.
Moreover, each element in row i of A except $a_{ij}$ would also be zero. But
then the matrix $AA^T$ would contain a zero element, and this is impossible
if A is to satisfy (M). Since A is composed of zeros and ones and since A
satisfies (M) with $N \ge 2$, it follows that A is an incidence matrix, and this
incidence matrix can be used to define the finite projective plane.


3. Congruence of Matrices. Let A and B be two symmetric matrices
of order n with elements in the rational field. The matrices A and B are
congruent, written A ~ B, provided there exists a non-singular matrix C
with rational elements such that
$$A = C^TBC.$$
It is easy to show that congruence of matrices satisfies the usual requirements
of an equals relationship.

Suppose now that A is an integral symmetric matrix of order and rank n.
It is well known that one can always construct an integral diagonal matrix
$D = [d_1, d_2, \ldots, d_n]$, where $d_i \ne 0$ for $i = 1, 2, \ldots, n$, such that D ~ A.











90 R. H. BRUCK AND H. J. RYSER 


The number of negative terms $\iota$ in this diagonal is called the index of A.
Sylvester's law of inertia states that $\iota$ is an invariant of A (see [7]).

Let $d = (-1)^{\iota}\delta$, where $\delta$ is the square free positive part of the determinant
|A| of the matrix A. From the matric equation $B = C^TAC$, it follows
that $|B| = |C|^2|A|$. Hence d is a second invariant of A.

Minkowski [9] and Hasse [4] have introduced a third invariant $c_p$, which
with the preceding two completes the system. Before discussing the invariant
$c_p$, we recall now the essentials of the Hilbert norm-residue symbol $(m, n)_p$.
The norm-residue symbol is defined for arbitrary non-zero integers m and n
and for every prime p. Its precise definition as well as complete proofs of
the following two theorems can be found in the collected works of Hilbert [5].

THEOREM 4. If m and n are integers not divisible by the odd prime p, then
(1) $(m, n)_p = +1$,
(2) $(n, p)_p = (p, n)_p = (n|p)$,
where $(n|p)$ is the Legendre symbol. Moreover, if $n \equiv m \not\equiv 0 \pmod p$, then
(3) $(m, p)_p = (n, p)_p$.

THEOREM 5. For arbitrary non-zero integers m, m', n, n' and for every
prime p,
(4) $(-n, n)_p = +1$,
(5) $(m, n)_p = (n, m)_p$,
(6) $(mm', n)_p = (m, n)_p (m', n)_p$,
(7) $(n, mm')_p = (n, m)_p (n, m')_p$.


At this point it is convenient to prove a Lemma which is useful for the
proof of Theorem 1 in section 4.

LEMMA. For p an odd prime and for every positive integer n,
(8) $(n, n + 1)_p = (-1, n + 1)_p$,
(9) $(n, n^2 + n + 1)_p = +1$,
(10) $\prod_{i=1}^{n} (i, i + 1)_p = ((n + 1)!, -1)_p$.

If p does not divide n or n + 1, then (8) is trivial. If p divides n, then
$n + 1 \equiv 1 \pmod p$ and if p divides n + 1, then $n \equiv -1 \pmod p$. By (3)
of Theorem 4 equation (8) is established. If p divides n, then $n^2 + n + 1 \equiv
(n + 1)^2 \not\equiv 0 \pmod p$ and if p divides $n^2 + n + 1$, then $n \equiv (n + 1)^2 \not\equiv 0
\pmod p$. This establishes (9). Equation (10) is a consequence of (8) and
Theorem 5.
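The identities (1) to (10) can be checked numerically. The following Python sketch is not part of the original paper. It assumes the standard closed formula for the norm-residue symbol at an odd prime p: writing $a = p^{\alpha}u$ and $b = p^{\beta}v$ with u, v prime to p, one has $(a, b)_p = (-1)^{\alpha\beta(p-1)/2}(u|p)^{\beta}(v|p)^{\alpha}$. The function names `legendre`, `split`, `hilbert`, and `check` are illustrative only.

```python
from math import factorial


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a|p) for an odd prime p and a prime to p (Euler's criterion)."""
    return 1 if pow(a % p, (p - 1) // 2, p) == 1 else -1


def split(a: int, p: int) -> tuple[int, int]:
    """Write the non-zero integer a as p**e * u with u prime to p; return (e, u)."""
    e = 0
    while a % p == 0:
        a //= p
        e += 1
    return e, a


def hilbert(a: int, b: int, p: int) -> int:
    """Norm-residue symbol (a, b)_p for an odd prime p and non-zero integers a, b."""
    alpha, u = split(a, p)
    beta, v = split(b, p)
    sign = -1 if (alpha * beta * (p - 1) // 2) % 2 else 1
    lu = legendre(u, p) if beta % 2 else 1
    lv = legendre(v, p) if alpha % 2 else 1
    return sign * lu * lv


def check(p: int, bound: int = 12) -> bool:
    """Test identities (1), (2), (4), (5), (6), (8), (9), (10) on a small range."""
    vals = [x for x in range(-bound, bound + 1) if x != 0]
    for m in vals:
        if hilbert(-m, m, p) != 1:                                    # (4)
            return False
        if m % p and hilbert(m, p, p) != legendre(m, p):              # (2)
            return False
        for n in vals:
            if hilbert(m, n, p) != hilbert(n, m, p):                  # (5)
                return False
            if m % p and n % p and hilbert(m, n, p) != 1:             # (1)
                return False
            for k in vals:
                if hilbert(m * k, n, p) != hilbert(m, n, p) * hilbert(k, n, p):  # (6)
                    return False
    for n in range(1, 4 * p):
        if hilbert(n, n + 1, p) != hilbert(-1, n + 1, p):             # (8)
            return False
        if hilbert(n, n * n + n + 1, p) != 1:                         # (9)
            return False
        prod = 1
        for i in range(1, n + 1):
            prod *= hilbert(i, i + 1, p)
        if prod != hilbert(factorial(n + 1), -1, p):                  # (10)
            return False
    return True


print(all(check(p) for p in (3, 5, 7, 11, 13)))  # prints True
```

Identities (3) and (7) are not tested separately: (3) holds because the Legendre symbol depends only on the residue mod p, and (7) follows from (5) and (6).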

Now let A be a non-singular and symmetric integral matrix of order n.
Let $D_r$ denote the leading principal minor determinant of order r, and
suppose that $D_r \ne 0$ for $r = 1, 2, \ldots, n$. The invariant $c_p$ is then defined
for every odd prime p by the equation
$$c_p = c_p(A) = (-1, -D_n)_p \prod_{i=1}^{n-1} (D_i, -D_{i+1})_p.$$
By (1) of Theorem 4, evidently $c_p = -1$ for only a finite number of p.













We are now in a position to state the fundamental Minkowski-Hasse 
theorem, a proof of which can be found in the original paper of Hasse [4]. 
More recent developments of the theory are discussed in [2] and [10]. 

THEOREM 6. Let A and B be two integral symmetric matrices of order and
rank n. Suppose further that the leading principal minor determinants of A
and B are different from zero. Then A ~ B if and only if A and B have the
same invariants $\iota$, d, and $c_p$ for every odd prime p.


4. Proof of Theorem 1. Let N be a positive integer and let $B_n$ denote
the integral matrix of order n with N + 1 down the main diagonal and ones
in all other positions. If we subtract column one of $B_n$ from each of the other
columns, and then add to row one each of the other rows, we obtain
$$|B_n| = N^{n-1}(N + n).$$
In particular if $n = N^2 + N + 1$, then $B_n$ is the matrix B of equation (M)
and |B| is the square of an integer.
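The determinant formula can be confirmed on small cases. The following Python sketch is an added illustration, not the authors' computation; the helper names `det` and `B` are hypothetical, and exact rational elimination is used only to avoid rounding.

```python
from fractions import Fraction
from math import isqrt


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def B(n: int, N: int):
    """Matrix of order n with N + 1 on the main diagonal and ones elsewhere."""
    return [[N + 1 if i == j else 1 for j in range(n)] for i in range(n)]


# |B_n| = N^(n-1) (N + n) on a small grid of cases.
for N in (1, 2, 3):
    for n in range(1, 9):
        assert det(B(n, N)) == N ** (n - 1) * (N + n)

# For n = N^2 + N + 1 the determinant is a perfect square; here N = 2, n = 7.
N = 2
n = N * N + N + 1
d = int(det(B(n, N)))
print(d, isqrt(d) ** 2 == d)  # prints 576 True
```

In general $|B| = N^{N^2+N}(N+1)^2$, and the exponent $N^2 + N$ is even, which is why the determinant is a square.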

If row n of $B_n$ is subtracted from each of the other rows, and if column n
is then subtracted from each of the other columns, the resulting matrix is
$$Q_n = \begin{pmatrix}
2N & N & N & \cdots & -N \\
N & 2N & N & \cdots & -N \\
N & N & 2N & \cdots & -N \\
\vdots & \vdots & \vdots & & \vdots \\
-N & -N & -N & \cdots & N + 1
\end{pmatrix},$$
and this matrix is congruent to $B_n$. Hence for every odd prime p, $c_p(B_n) =
c_p(Q_n)$. Moreover, if $E_i$ denotes the determinant of order i with 2N down
the main diagonal and N in all other positions, then $E_i = N^i(i + 1)$. Thus
if $n = N^2 + N + 1$ and if p is an odd prime, then the invariant $c_p(B) =
c_p(Q_n)$ of the matrix B of equation (M) is given by
$$c_p(B) = (E_{n-1}, -1)_p \prod_{i=1}^{n-2} (E_i, -E_{i+1})_p.$$
In the subsequent computation we prove
(E) $c_p(B) = (-1, N)_p^{N(N+1)/2}$.

By Theorem 5 and (10), and omitting for convenience the subscript p,
$$\prod_{i=1}^{n-2} (E_i, -E_{i+1}) = \prod_{i=1}^{n-2} \bigl(N^i(i + 1),\ -N^{i+1}(i + 2)\bigr)$$
$$= \Bigl[\prod_{i=1}^{n-2} (N^i, -N^{i+1})\,(i + 1, -(i + 2))\Bigr] S$$
$$= (N, -1)^{(n-1)(n-2)/2}\,((n - 1)!, -1)\,(n!, -1)\, S,$$
where
$$S = \prod_{i=1}^{n-2} (N^i, i + 2)\,(N^{i+1}, i + 1).$$
Moreover, by (9)
$$S = \prod_{i=1}^{n-2} (N^i, i + 2) \prod_{i=0}^{n-3} (N^{i+2}, i + 2) = (N, n)^{n-2} = +1.$$
Thus
$$c_p(B) = (N^{n-1}n, -1)\,(N, -1)^{(n-1)(n-2)/2}\,(n, -1)$$
$$= (N, -1)^{n-1}\,(N, -1)^{(n-1)(n-2)/2} = (N, -1)^{N(N+1)/2},$$
and this establishes equation (E).
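Equation (E) can also be tested directly from the definition of $c_p$. The following Python sketch is not part of the original paper. It assumes the standard closed formula for the norm-residue symbol at an odd prime, and it uses the leading principal minors $D_i = N^{i-1}(N + i)$ of B itself (from the determinant formula for $B_n$) rather than those of $Q_n$, relying on the invariance of $c_p$ under congruence. All function names are hypothetical.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a|p) for an odd prime p and a prime to p (Euler's criterion)."""
    return 1 if pow(a % p, (p - 1) // 2, p) == 1 else -1


def split(a: int, p: int) -> tuple[int, int]:
    """Write the non-zero integer a as p**e * u with u prime to p; return (e, u)."""
    e = 0
    while a % p == 0:
        a //= p
        e += 1
    return e, a


def hilbert(a: int, b: int, p: int) -> int:
    """Norm-residue symbol (a, b)_p for an odd prime p, via the assumed closed formula."""
    alpha, u = split(a, p)
    beta, v = split(b, p)
    sign = -1 if (alpha * beta * (p - 1) // 2) % 2 else 1
    lu = legendre(u, p) if beta % 2 else 1
    lv = legendre(v, p) if alpha % 2 else 1
    return sign * lu * lv


def c_p_of_B(N: int, p: int) -> int:
    """c_p(B) from the definition, with D_i = N^(i-1) (N + i) and n = N^2 + N + 1."""
    n = N * N + N + 1
    D = [N ** (i - 1) * (N + i) for i in range(1, n + 1)]
    value = hilbert(-1, -D[-1], p)
    for i in range(n - 1):
        value *= hilbert(D[i], -D[i + 1], p)
    return value


def rhs_E(N: int, p: int) -> int:
    """Right-hand side of (E): (-1, N)_p raised to the power N(N + 1)/2."""
    return hilbert(-1, N, p) ** (N * (N + 1) // 2)


for N in range(2, 12):
    for p in (3, 5, 7, 11):
        assert c_p_of_B(N, p) == rhs_E(N, p)

print(c_p_of_B(2, 3), c_p_of_B(6, 3))  # prints 1 -1
```

The value $c_3(B) = -1$ for N = 6 is exactly the obstruction that rules out a plane of order 6.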

Suppose now that $\pi$ is a finite projective plane with N + 1 points on a line.
Then by equation (M) of section 2, the matrix B is congruent to the identity
matrix I. Since $c_p(I) = +1$ for every odd prime p, it follows that if $\pi$
exists, then for every odd prime p,
$$c_p(B) = (-1, N)_p^{N(N+1)/2} = +1.$$
If now $N \equiv 1$ or $2 \pmod 4$, then the exponent $N(N + 1)/2$ is odd. Moreover,
if a prime p of the form 4k + 3 divides the square free part of N, then
$(-1, N)_p = -1$. This is a contradiction and completes the proof of
Theorem 1.


Postscript (November 13, 1948) 


(a) In a letter to one of the authors, dated May 11, 1948, Marshall Hall
pointed out that the n-rowed symmetric matrix B of section 4 ($n = N^2 + N + 1$)
is the matrix of a quadratic form which can be written as
$$(x_2 + \ldots + x_n)^2 + N\left(x_2 + \frac{x_1}{N}\right)^2 + \ldots + N\left(x_n + \frac{x_1}{N}\right)^2.$$
Hall's remark demonstrates concretely that B is rationally congruent to
the diagonal matrix D = (1, N, N, ..., N) and thus permits a simpler
derivation of equation (E).
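Hall's identity can be spot-checked with exact arithmetic. The following Python sketch is an added illustration, not part of the original paper. It reads the displayed form as $(x_2 + \ldots + x_n)^2 + N(x_2 + x_1/N)^2 + \ldots + N(x_n + x_1/N)^2$ and compares it with $x^TBx = N\sum x_i^2 + (\sum x_i)^2$ on random integer vectors; the function names and the sampling range are assumptions.

```python
from fractions import Fraction
import random


def form_B(x, N):
    """x^T B x for B with N + 1 on the diagonal and ones elsewhere."""
    return N * sum(xi * xi for xi in x) + sum(x) ** 2


def form_hall(x, N):
    """(x_2 + ... + x_n)^2 + N * sum over i >= 2 of (x_i + x_1 / N)^2."""
    tail = x[1:]
    shift = Fraction(x[0], N)
    return sum(tail) ** 2 + N * sum((xi + shift) ** 2 for xi in tail)


random.seed(0)
for N in (2, 3, 4, 5):
    n = N * N + N + 1          # the identity depends on n - 1 = N (N + 1)
    for _ in range(50):
        x = [random.randint(-9, 9) for _ in range(n)]
        assert form_B(x, N) == form_hall(x, N)

print("identity holds on all samples")
```

The n linear forms appearing in the squares are independent, which is what gives the rational congruence to the diagonal matrix (1, N, ..., N).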

(b) In 1782 Euler conjectured that a pair of orthogonal latin squares (or
a graeco-latin square) of order N cannot exist if N has the form 4k + 2. The
truth of Euler's conjecture would ensure (see [1], [8]) the non-existence of
projective planes with $N \equiv 2 \pmod 4$ and hence would both imply and improve
one half of Theorem 1. For this reason the authors have decided to add to
the bibliography a paper by H. F. MacNeish [14] containing a "proof" of
Euler's conjecture. The correctness of this proof, however, has been questioned
by F. W. Levi. In this connection see [6] (Second Lecture); Jahrbuch der
Math., vol. 48 (1921), 71; Jahrbuch der Math., vol. 49 (1923), 41-42.


REFERENCES 


[1] R. C. Bose, "On the application of the properties of Galois fields to the problem
of construction of hyper-Graeco-Latin squares," Sankhya, Indian Journal of Statistics, vol. 3
(1938), 323-338.

[2] W. H. Durfee, "Quadratic forms over fields with a valuation," Bull. Amer. Math.
Soc., vol. 54 (1948), 338-351.

[3] M. Hall, "Projective planes," Trans. Amer. Math. Soc., vol. 54 (1943), 229-277.

[4] H. Hasse, "Über die Äquivalenz quadratischer Formen im Körper der rationalen
Zahlen," J. reine angew. Math., vol. 152 (1923), 205-224.

[5] D. Hilbert, Gesammelte Abhandlungen, I (Berlin, 1932), 161-173.

[6] F. W. Levi, Finite geometrical systems (University of Calcutta, 1942).

[7] C. C. MacDuffee, The theory of matrices (New York, 1946), 56.

[8] H. B. Mann, "On orthogonal Latin squares," Bull. Amer. Math. Soc., vol. 50
(1944), 249-257.

[9] H. Minkowski, Gesammelte Abhandlungen, I (Leipzig and Berlin, 1911), 219-239.

[10] G. Pall, "The arithmetical invariants of quadratic forms," Bull. Amer. Math. Soc.,
vol. 51 (1945), 185-197.

[11] G. Tarry, "Le problème de 36 officiers," Compte Rendu de l'Association Française
pour l'Avancement de Science Naturel, vol. 1 (1900), 122-123, vol. 2 (1901), 170-203.

[12] O. Veblen and W. H. Bussey, "Finite projective geometries," Trans. Amer. Math.
Soc., vol. 7 (1906), 241-259.

[13] O. Veblen and J. H. M. Wedderburn, "Non-Desarguesian and non-Pascalian
geometries," Trans. Amer. Math. Soc., vol. 8 (1907), 379-388.

[14] H. F. MacNeish, "Euler squares," Ann. of Math., vol. 23 (1921-22), 221-227.


The University of Wisconsin 











GENERALIZED VECTOR SPACES. I. 


THE STRUCTURE OF FINITE-DIMENSIONAL SPACES 


KARL MENGER 


1. INTRODUCTION 


During the last fifty years, the concept of the Euclidean space (an n- 
dimensional coordinate space with a Pythagorean distance) has undergone 
various profound generalizations. 

Hilbert introduced the infinitely-dimensional Euclidean space whose 
points are infinite sequences of coordinates having from the origin, and thus 
from each other, finite Pythagorean distances. 

Minkowski generalized the Pythagorean distance. Any surface which is
symmetric about the origin, o, and intersects every ray issuing from o in
exactly one point, is admitted as the "unit sphere" about o, that is, as the
set of all points having the distance 1 from o. The distance from o to a point
whose coordinates are k times those of a point on the unit sphere, is k.
Minkowski chose a congruent unit sphere about every point. He discovered 
the equivalence of the convexity of these spheres and the triangle inequality 
for the distance. Finsler introduced spaces which are locally Minkowskian 
in the same sense in which Riemann spaces are locally Euclidean. With each 
point, a “tangential” Minkowskian space is associated—the unit sphere 
varying from point to point. Finsler found that each positively definite 
problem of the Calculus of Variations gives rise to one of his spaces. 

In the finite-dimensional case, Weyl noticed that the definition of points 
by coordinates could be replaced by the assumption that undefined points 
can be added, and multiplied by real numbers. Banach, Hahn, and Wiener 
[1], independently of each other, introduced the following concept. A set of 
elements, v, w, . . . (called vectors) is said to be a vector space if 

(a) the set is a commutative group, the operation being denoted by +,
the neutral element by o, so that $v + o = o + v = v$;
(b) an associative and doubly distributive multiplication of vectors by
real numbers, $\alpha, \beta, \ldots$ is defined, that is to say,
$\alpha(\beta v) = (\alpha\beta)v$, $(\alpha + \beta)v = \alpha v + \beta v$, $\alpha(v + w) = \alpha v + \alpha w$;
for the multiplication by the numbers 1, -1, and 0 we have
$1v = v$, $-1v = -v$, $0v = o$;
(c) with each vector, v, a real number $\|v\|$ is associated, called the norm
of v, which satisfies the following three conditions
(1) $\|\alpha v\| = |\alpha|\,\|v\|$ for every $v$;
(2) $\|v + w\| \le \|v\| + \|w\|$;
(3) if $v \ne o$, then $\|v\| > 0$.
Received October 1, 1948. 


























GENERALIZED VECTOR SPACES 95 


Much earlier, Fréchet had introduced the most radical generalization of the 
Euclidean space by assuming only that a number (called distance) be 
associated with every unordered pair of elements of a set, identical elements 
having the distance 0, distinct elements a positive distance, while the distance 
satisfies the triangle inequality. As a price for the generality of these metric 
spaces we have to accept the possible absence of directions of any kind. 

In applying metric methods to the Calculus of Variations we made use of 
all these generalizations of the concept of space [2]. We studied the minima 
of line integrals even in a general metric space. Our integrand is a function 
of the point and (in absence of a direction) of an ordered pair of distinct 
points. Multiplying the distance by this function we obtain a new distance 
which we call the variational distance. If, in particular, the metric space is a 
vector space, and the function is positive and endowed with strong continuity 
properties, then one obtains a Finsler space. If the metric space is Euclidean, 
then our results generalized Tonelli’s existence theorems for the parametric 
case. 

Besides synthesizing the various known concepts, the metric ideas in the 
Calculus of Variations led to a generalization of the idea of space in a new 
direction. Minkowski spaces as well as the vector spaces of Banach, Hahn, and 
Wiener, and even Fréchet’s metric spaces, have the following two important 
features in common: distinct points have distances $\ne 0$; and distances are
non-negative. But, on every level of generality, the only source of the lower
semi-continuity of the line integral is the local triangle inequality of the
variational distance. The two other traditional features (and still more, of
course, the symmetry) of the distance appeared to be quite inessential. As
a result, one can, in particular, generalize Finsler's concept in such a way that
one can associate a generalized Finsler space also with semi-definite and 
indefinite parametric variational problems in vector spaces. 

As a by-product, these studies yielded a generalization of the concept of a 
vector space. Alt [3] proved the equivalence of the triangle inequality with 
what he called “projective convexity’ of the unit sphere. Pauc [4] and 
Aronszajn continued this work in many interesting ways and the latter 
first explicitly formulated the concept of general vector spaces [5] which 
implicitly was contained in our remarks [6] about what we called ‘‘generalized 
Minkowskian metric.” In our spaces we had admitted that distinct points 
might have the distance 0, and that the distance of two points might be 
negative. In fact, we mentioned that vectors might have negative norms or 
the norm 0. We had dropped the symmetry of the distance and the norm. 
All we had retained was the triangle inequality for distances and norms, and 
the assumption that by multiplying a vector by a positive number, k, the 
norm was multiplied by k. 

Now we intend to study these generalized vector spaces in a series of papers. 
The present first paper contains a few remarks about all generalized vector 
spaces but essentially deals with the structure of generalized vector spaces 
of finite dimension. We prove that each such space is built up of a subspace 











96 KARL MENGER 


all of whose vectors (except 0) have a positive norm; a subspace all of whose 
vectors have the norm 0; and possibly one single line containing a vector 
with a non-positive norm while the norm of the opposite vector is positive. 

In subsequent papers we shall study spaces of infinitely many dimensions, 
metric properties of our spaces as well as topological aspects of the theory 
(“triangular topologies’), non-real multipliers, and applications which, 
besides the Calculus of Variations, comprise the theories of operators and of 
normed rings. 


2. THE MAIN TYPES OF GENERALIZED VECTOR SPACES


A generalized vector space is a set, V, of elements for which addition, and
multiplication by real numbers, are defined according to Postulates (a) and
(b) while the norm $\|v\|$ of a vector is a real number satisfying only one and
one half of the three Postulates (c), namely,

($1^+$) If $\alpha > 0$, then $\|\alpha v\| = \alpha\|v\|$.
(2) $\|v + w\| \le \|v\| + \|w\|$.

We do not postulate the other half of (1), that is

($1^-$) If $\alpha < 0$, then $\|\alpha v\| = -\alpha\|v\|$,

nor the two important properties of the ordinary vector spaces which are
jointly postulated in (3), namely, that $\|v\| \ge 0$ and that $v \ne o$ implies $\|v\| \ne 0$.

We shall briefly call a vector, v, positive, negative or null according to whether
$\|v\|$ is positive, negative or 0. We call the vector v degenerate if $v \ne o$ and
$\|v\| = \|-v\| = 0$.

We call a general vector space, V,

definite if every vector, except o, is positive;

semi-definite if V is not definite but no vector is negative and at least
one vector is positive;

indefinite if V contains both positive and negative vectors;

degenerate if V contains at least one degenerate vector;

non-degenerate if V is not degenerate;

totally degenerate if every vector of V is degenerate and thus null.

The vector spaces of Banach, Hahn, and Wiener are the definite vector
spaces. If by O* we mean the set containing only a vector o, then, according
to the above terminology, O* is a definite vector space, and consequently
non-degenerate. In fact, O* is a vector space in the ordinary sense.

That we use the terms definite and semi-definite instead of positively
definite and positively semi-definite will not lead to ambiguities since we shall
see in Section 4 that no space is negatively definite or negatively semi-definite.
There would be only negatively definite and negatively semi-definite spaces
if we postulated ($1^+$) in conjunction with the triangle contra-inequality
$$\|v + w\| \ge \|v\| + \|w\|.$$



















3. VECTOR-ALGEBRAIC PRELIMINARIES


A subset, V', of V is called a subspace if for every two vectors, v and w,
of V' and for every number $\alpha$, the vectors v + w and $\alpha v$ belong to V'. The
set consisting of o alone is the subspace O*.

If S is a subset of V, then we denote by V(S) the subspace of V consisting
of all vectors $\alpha_1 v_1 + \alpha_2 v_2 + \ldots + \alpha_n v_n$ where n is any integer, the $\alpha_i$ are
numbers, the $v_i$ vectors of S. In particular, if S consists of only one vector
$v \ne o$, then V(S) is called the v-line and denoted by [v]. In the usual way,
we mean by V' + V'', the join of two subspaces V' and V'', the subspace
V(S) where S is the set of all vectors belonging to V' and/or V''; by V'.V'',
the intersection of V' and V'', the subspace of all vectors belonging to both
V' and V''. If $V' \ne O^* \ne V''$ and V'.V'' = O*, then V' and V'' are called
independent.

LEMMA. If V' is a subspace of V, then there exists a subspace, V'', of V
such that V' + V'' = V and V'.V'' = O*.

If V' = V, then V'' = O*. If $V' \ne V$, then there exists a vector, $v_1$, which
does not belong to V'. In this case, let $\Omega$ be any ordinal number about which
we make the following assumption: with every ordinal number $\omega < \Omega$ a
vector, $v_\omega$, has been associated in such a way that if $S_\omega$ is the set of all vectors
$v_1, \ldots, v_\omega$, then

(1) the set $S_\omega$ does not contain any finite subset of dependent vectors;

(2) $V(S_\omega).V' = O^*$.

We call $T_\Omega$ the set of all vectors $v_\omega$ such that $\omega < \Omega$. Then two cases are
possible. Either $V = V' + V(T_\Omega)$ in which case we set $V'' = V(T_\Omega)$ and our
proposition holds. Or V contains vectors not belonging to the join $V' +
V(T_\Omega)$. In this case, we call one of these vectors $v_\Omega$, and denote the set of all
vectors $v_1, \ldots, v_\Omega$ by $S_\Omega$. Then we have associated a vector $v_\omega$ with every
ordinal number $\omega \le \Omega$ in such a way that conditions (1) and (2) are satisfied.
There exists an ordinal number $\Omega$ such that the first case prevails. If V is
n-dimensional, this follows by induction, and $\Omega \le n$. If V is infinitely
dimensional, the conclusion is valid by transfinite induction.

If $v \ne o$, then we call the set of the vectors $\alpha v$ for all $\alpha > 0$, the open v-ray
or, briefly, since we shall not consider rays which include o, the v-ray. We
call the (-v)-ray the opposite ray. The v-line consists of o, the v-ray, and the
opposite ray.

If v and w are independent vectors (that is, vectors $\ne o$ neither lying on
the line of the other), then we call the set of the vectors $\alpha v + \beta w$ for all real
numbers $\alpha, \beta$, the v,w-plane. We further call the set of the vectors $\alpha v + \beta w$
such that $\alpha \ge 0$, $\beta \ge 0$ ($\alpha > 0$, $\beta > 0$) the closed (open) v,w-quadrant. We
denote these quadrants by [v,w] and (v,w), respectively. We can also
introduce the half-open quadrants [v,w) and (v,w].

The set of all vectors which are opposite to those of the open, the closed,
the half-open first v,w-quadrants are called the open, the closed, the
half-open third v,w-quadrants, respectively. We denote these sets by )v,w(, ]v,w[,
]v,w(, )v,w[, respectively. Clearly, the first and third quadrants are
associated with the unordered vector pair, v, w. With the ordered pair
v, w we can also associate the closed second v,w-quadrant, that is, the set ]v,w]
of the vectors $-\alpha v + \beta w$ such that $\alpha \ge 0$, $\beta \ge 0$. Similarly we define
)v,w), ]v,w), )v,w], and the fourth v,w-quadrants. One readily proves

REMARK 1. If v, w, x are pairwise independent vectors and x belongs to
(v,w), then a vector y belongs to (v,w) if and only if y either belongs to (v,x) or to
(x,w) or to the x-ray.

REMARK 2. If v, w, x are pairwise independent and x belongs to )v,w(, then
every vector of the v,w-plane belongs to (v,w) or to (v,x) or to (w,x) or to the rays
of one of the vectors v, w, x.


4. COROLLARIES OF THE ASSUMPTIONS ABOUT THE NORM 


We shall deduce immediate consequences of the assumptions ($1^+$) and (2)
about the norm in a generalized vector space.

If in ($1^+$) we set $\alpha = 2$ and $v = o$, then since $2o = o$ we conclude

($1^0$) $\|o\| = 0$.

If in (2) we set $v = v' + v''$ and $w = -v''$, we obtain
$$\|v'\| = \|v' + v'' - v''\| \le \|v' + v''\| + \|-v''\|,$$
thus
$$\|v' + v''\| \ge \|v'\| - \|-v''\|.$$
Similarly, $\|v' + v''\| \ge \|v''\| - \|-v'\|$. Hence
(2a) $\mathrm{Max}\,[\,\|v\| - \|-w\|,\ \|w\| - \|-v\|\,] \le \|v + w\| \le \|v\| + \|w\|$.
If in (2a) we set $w = -v$, then by ($1^0$) we have $0 \le \|v\| + \|-v\|$ and thus
$$\|v\| \ge -\|-v\| \quad \text{and} \quad \|-v\| \ge -\|v\|.$$

In particular, we can formulate the following

LEMMA. The opposite of a negative vector is positive. The opposite of a
null vector is non-negative.

As a corollary of this lemma we see that no space is negatively definite or
negatively semi-definite.
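A concrete example may help to fix these consequences. The following Python sketch is an added illustration, not taken from the paper: the norm $\|(x, y)\| = x + |y|$ on the plane is a hypothetical choice that satisfies the positive homogeneity and the triangle inequality but takes negative values, so the space is indefinite. The sketch checks both postulates, inequality (2a), and the Lemma on a grid of integer vectors.

```python
import itertools


def norm(v):
    """Hypothetical generalized norm on the plane: ||(x, y)|| = x + |y|."""
    x, y = v
    return x + abs(y)


def add(v, w):
    return (v[0] + w[0], v[1] + w[1])


def scale(a, v):
    return (a * v[0], a * v[1])


def neg(v):
    return scale(-1, v)


vectors = list(itertools.product(range(-4, 5), repeat=2))
for v in vectors:
    for a in (1, 2, 3):
        assert norm(scale(a, v)) == a * norm(v)        # positive homogeneity
    if norm(v) < 0:
        assert norm(neg(v)) > 0                        # opposite of a negative vector
    if norm(v) == 0:
        assert norm(neg(v)) >= 0                       # opposite of a null vector
    for w in vectors:
        s = norm(add(v, w))
        assert s <= norm(v) + norm(w)                  # triangle inequality (2)
        assert s >= max(norm(v) - norm(neg(w)),
                        norm(w) - norm(neg(v)))        # lower bound in (2a)

print(norm((-1, 0)), norm((1, 0)), norm((-1, 1)), norm((1, -1)))
# prints -1 1 0 2
```

Here (-1, 0) is a negative vector whose opposite is positive, and (-1, 1) is a null vector whose opposite is positive, so neither is degenerate.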

In absence of a general concept of limit, we can prove only two restricted
continuity properties of the norm.

ADDITIVE CONTINUITY. For every $\epsilon > 0$ there exists a $\delta > 0$, namely $\delta = \epsilon$,
such that for every vector v
from $\|w\| < \delta$ and $\|-w\| < \delta$ it follows that $\|v\| - \delta < \|v + w\| < \|v\| + \delta$.
This is an immediate consequence of (2a).

FINITE-DIMENSIONAL CONTINUITY. For every $\epsilon > 0$, every integer n,
and every n-tuple of vectors $w_1, w_2, \ldots, w_n$, there exists a $\delta > 0$ (depending
upon $\epsilon, w_1, \ldots, w_n$) such that from
$$|\delta_1| < \delta,\ |\delta_2| < \delta,\ \ldots,\ |\delta_n| < \delta$$
for every vector v it follows that
$$\|v\| - \epsilon < \|v + \delta_1 w_1 + \delta_2 w_2 + \ldots + \delta_n w_n\| < \|v\| + \epsilon.$$

Setting $\delta_1 w_1 + \ldots + \delta_n w_n = w$ we see that both $\|w\|$ and $\|-w\|$ are
$$\le n\ \mathrm{Max}\,|\delta_i| \cdot \mathrm{Max}\,[\,\|w_i\|,\ \|-w_i\|\,].$$
Thus $\delta = \dfrac{\epsilon}{n\ \mathrm{Max}\,[\,\|w_i\|,\ \|-w_i\|\,]}$ satisfies the requirement.


5. LEMMAS 


LEMMA 1. If v and w are independent non-positive vectors, then every vector
of [v,w], i.e., the closed first v,w-quadrant, is non-positive.

For if $\alpha \ge 0$ and $\beta \ge 0$, then
$$\|\alpha v + \beta w\| \le \|\alpha v\| + \|\beta w\| = \alpha\|v\| + \beta\|w\|.$$
The last expression is $\le 0$ if $\|v\| \le 0$ and $\|w\| \le 0$.

The last expression is < 0 if $\alpha > 0$, $\beta > 0$ and at least one of the vectors v
and w is negative. We thus have proved

LEMMA 2. If of two independent vectors, v and w, one is non-positive and the
other negative, then every vector of (v,w), i.e. the open first v,w-quadrant, is
negative.

LEMMA 3. If v and w are independent null vectors, then either every vector of
(v,w) is null or every vector of (v,w) is negative.

By Lemma 1, every vector of (v,w) is non-positive. Either every vector of
(v,w) is null or there exists a negative vector, x, of (v,w). In the latter case,
by Lemma 2, every vector of (v,x) and of (x,w) is negative. Since by Remark 1
of Section 3, every vector of (v,w) belongs either to (v,x) or to (x,w) or to the
x-ray, every vector of (v,w) is negative.

LEMMA 4. If v and w are independent and v is degenerate, then every vector
of the open half-plane (v,w] + [w,-v) of the v,w-plane has the same sign as w.

If w or any other vector of (v,w] + [w,-v) is negative, then by Lemma 2
every vector in both quadrants is negative. If w is null, then by Lemma 1
every vector of [v,w] and every vector of [-v,w] is non-positive. By Lemma 3
none of these vectors is negative. Similarly, if any vector w' of the open
half-plane is null, all vectors are null. Consequently, if w is positive, then
every vector of the open half-plane is positive.

An obvious consequence of Lemma 4 is the following

COROLLARY. If v and w are independent degenerate vectors, then every vector
of the v,w-plane is degenerate.


6. THE DEGENERATE PART OF GENERALIZED VECTOR SPACES


The set, $V_d$, of all degenerate vectors of the vector space V is a subspace of
V which we shall call the degenerate part of V. For if v is a degenerate vector,
then obviously $\alpha v$ is degenerate for every $\alpha$; and if v and w are degenerate,
then by the Corollary of Lemma 4, the vector v + w is degenerate.

THEOREM 1. Every generalized vector space, V, is the sum of a uniquely
determined totally degenerate subspace, $V_d$, and a non-degenerate subspace V'.
The latter can be chosen in such a way that V' and $V_d$ are independent unless
V is totally degenerate or non-degenerate. In the former case we have V' = O*,
in the latter case, $V_d$ = O*.













Let V_d be the degenerate part of the vector space V. By the Lemma of
Section 3, there exists a subspace V’ such that V = V_d + V’ and V_d · V’ = O*.
The subspace V’ is non-degenerate since V’ and V_d have only the vector o
in common, and V_d contains all degenerate vectors.


7. THE NON-POSITIVE PART OF A VECTOR SPACE


We shall call a subset, C, of a vector space a cone if
(a) C contains o and at least one vector besides o;
(b) for every vector, v, of C, except o, the v-ray is a subset of C.
We shall call the cone convex if
(c) for every two independent vectors, v and w, of C every vector of
the first quadrant (v,w) belongs to C.
We shall call the cone proper if
(d) C does not contain two opposite vectors.

We shall refer to proper convex cones briefly as cones. We shall call the
cone C open if every vector of C, except o, is an interior element of C. Here
we define interior elements without reference to spherical neighbourhoods in
the following way. The vector w is an interior element of the subset S of
the vector space V if for every vector, v, of V there exists a positive number α
(depending upon v) such that for every κ between 0 and α the vector w + κv
belongs to S. We call a cone, C, closed if the set of all vectors not belonging to
C is open. By the boundary of an open cone, C, we mean the set of all vectors
w which do not belong to C while for some vector, v, of the vector space and
every sufficiently small positive number ε the vector w + εv does belong to C.

In terms of these concepts we can formulate the following 

THEOREM 2. In a non-degenerate vector space, V, which is not definite, the
set of all non-positive vectors is a closed cone. If V is indefinite, then the set 
consisting of o and all negative vectors is an open cone with the set of all null 
vectors as its boundary. 

Let V be a non-degenerate vector space which is not definite, that is to say,
contains a non-positive vector v ≠ o. Then the set, C, of all non-positive
vectors is a cone since: (a) o is non-positive and, by assumption, V contains at
least one non-positive vector ≠ o; (b) every positive multiple of a non-positive
vector is non-positive; (c) the convexity condition is satisfied by virtue of
Lemma 1 of Section 5; condition (d) is satisfied because, by assumption, V is
non-degenerate. The cone C is closed since, by virtue of the continuity of the
norm, the set of all positive vectors is open.

Now let V be indefinite, that is, contain a negative vector. Then the set
consisting of o and all negative vectors satisfies conditions (a) and (b);
moreover, by virtue of Lemma 2 of Section 5, condition (c); and condition (d)
since the opposite of a negative vector is positive. This cone is open by virtue
of the continuity of the norm. Each null vector, v, belongs to the boundary of
the open cone. For if x is a negative vector, then, for every positive α which is
< 1, the vector v + α(x − v) is negative. Hence v belongs to the boundary
of the cone.






















8. THE DECOMPOSITION OF FINITE-DIMENSIONAL SPACES


Lemma. If V* is a finite-dimensional definite subspace and P a plane which is 
independent of V* and such that W = V* + P is non-degenerate, then W 
contains a vector, w', such that V* + [w’] is definite. If V* = O*, the Lemma 
contends that every non-degenerate plane contains a definite line. 

From the definiteness of V* we deduce the following 

REMARK A. If z is a non-positive vector of W, then for every vector v* of V* 
and every number a > 0, the vector v* — az is positive. 

For if y = v* —az were non-positive, then, since W is non-degenerate, y and 
z would be independent, and by Lemma 1, every vector of the first quadrant 
(y,z) would be non-positive whereas v* = y + az is positive. 

From the finite dimensionality of V* (and thus of W) we deduce a Remark
B concerning a property of a particular subset, S, of W. Only this single
consequence of the assumption that V* has a finite dimension, say k − 2,
will be used in the proof. If $x_1, \ldots, x_k$ are independent vectors of W, let S
denote the set of all vectors $\alpha_1 x_1 + \ldots + \alpha_k x_k$ for which $\sum \alpha_i^2 = 1$.

REMARK B. Every sequence $s_1, s_2, \ldots$ of vectors of S contains a subsequence
$s_{i_1}, s_{i_2}, \ldots$ for which a vector, s, of S exists such that

$$\lim_{n \to \infty} |s_{i_n}| = |s|.$$

If, in particular, the kth components of the vectors $s_1, s_2, \ldots$ converge to 0, then
s can be so chosen that its last component is 0. If the (k − 1)th components
of all vectors $s_1, s_2, \ldots$ are positive, then s can be so chosen that the (k − 1)th
component of s is non-negative.

By virtue of the compactness of the sphere $\sum \alpha_i^2 = 1$ in the k-dimensional
Cartesian space of the $(\alpha_1, \ldots, \alpha_k)$, the $s_{i_n}$ can be so selected that if

$$s - s_{i_n} = \alpha_{n,1} x_1 + \ldots + \alpha_{n,k} x_k,$$

then as $n \to \infty$

$$\lim \alpha_{n,1} = \lim \alpha_{n,2} = \ldots = \lim \alpha_{n,k} = 0.$$

Hence Remark B is a consequence of the finite-dimensional continuity of
the norm.

If every vector of W is positive, then the proposition of the Lemma holds.
We thus can assume that W contains a non-positive vector, v. Its opposite,
the vector −v, is positive. By Remark A, for every v* of V* and every
α > 0, the vector v* − αv is positive. W contains a vector, w, such that [w]
and V* + [v] are independent. Now we prove:

There exists a β ≥ 0 with the following property P. If α > β, then for
every v* of V*, the vector v* − αv + w is positive.

We assume that no number β ≥ 0 has the property P and derive a contradiction.
By the assumption, for every n there exists a number $\alpha_n > n$ and
a vector, $v^*_n$, of V* such that

$$y_n = v^*_n - \alpha_n v + w$$

is non-positive. Let $x_1, \ldots, x_{k-2}$ be independent vectors of V* such that

$$v^*_n = \alpha_{n,1} x_1 + \ldots + \alpha_{n,k-2} x_{k-2}.$$













We set

$$x_{k-1} = -v \ \text{ and } \ \alpha_{n,k-1} = \alpha_n,$$

$$x_k = w \ \text{ and } \ \alpha_{n,k} = 1,$$

$$\Bigl[\sum_i (\alpha_{n,i})^2\Bigr]^{1/2} = \gamma_n > 0.$$

Since $\alpha_n > n$ we have $\lim \gamma_n = \infty$. If we set

$$s_n = \frac{1}{\gamma_n}\, y_n,$$

then by Remark B there exists a subsequence $s_{i_1}, s_{i_2}, \ldots$ and a vector s of
S such that

$$\lim |s_{i_n}| = |s|.$$

Since $\lim \gamma_n = \infty$ and $\alpha_{n,k} = 1$ and $\alpha_{n,k-1} > 0$ for every n, we see that s
can be chosen as a vector of the form v* − αv where v* is in V* and α ≥ 0.
Thus s is positive. This is a contradiction since the $s_n$ and, in particular,
the $s_{i_n}$ are non-positive. (The $y_n$ are non-positive, the $\gamma_n > 0$).


Having established the existence of a β ≥ 0 with the property P, we call
$\beta_0$ the g.l.b. of all β ≥ 0 with the property P. Two cases are possible.


First Case. $\beta_0 > 0$. Then from the definition of $\beta_0$ it follows¹ that there
exists a non-positive vector (and thus clearly a null vector)
$w_0 = v^*_0 - \beta_0 v + w$ where $v^*_0$ belongs to V*. Now if for a vector v* of V*
and κ > 0 the vector $w'_\kappa = v^* - (\kappa - \beta_0) v - w$ were non-positive, then since
V* + P is non-degenerate, the vectors $w_0$ and $w'_\kappa$ would be independent, and
all vectors of the first quadrant $(w_0, w'_\kappa)$ would be non-positive. But this is
not the case since $w_0 + w'_\kappa$, that is, $(v^*_0 + v^*) - \kappa v$, is positive. Hence all
vectors $w'_\kappa$ are positive. Since $w_0$ is non-positive, by Remark A, for every
vector v* of V* and every α > 0, the vector

$$v^* - \alpha w_0 = \alpha\left(\frac{v^* - \alpha v^*_0}{\alpha} + \beta_0 v - w\right)$$

is positive. Hence for every $v^*_1$ of V*, the vector $v^*_1 + \beta_0 v - w$ is positive.
In particular, $-v^*_0 + \beta_0 v - w$ is positive. Since its opposite, the vector
$w_0$, is non-positive, by Remark A all vectors $v^* - \alpha w_0$ or, which is equivalent,
all vectors

$$v^* + \alpha(\beta_0 v - w) \quad \text{for } v^* \text{ in } V^* \text{ and } \alpha > 0$$

are positive. From this fact it follows [exactly as from the positivity of all
vectors v* − αv we derived the existence of a β ≥ 0 with the property P]
that there exists a γ ≥ 0 such that for each v* of V* and every α > γ the vector

$$y_\alpha = v^* + \alpha(\beta_0 v - w) + w$$

is positive. No matter how we determine α > γ, for every κ > 0 the vectors

$$\kappa\,[v^* + \alpha(\beta_0 v - w) + w]$$

or, which is equivalent, the vectors

$$v^* + \kappa\,[\alpha(\beta_0 v - w) + w]$$

are positive for every v* of V*. Now if α > 1, then for every v* of V* the
vector

$$v^* - \alpha(\beta_0 v - w) - w = (\alpha - 1)\left[\frac{v^*}{\alpha - 1} + w - \frac{\alpha}{\alpha - 1}\,\beta_0 v\right]$$

is positive by definition of $\beta_0$. Hence every vector

$$v^* - \kappa\,[\alpha(\beta_0 v - w) + w]$$

is positive. Thus if we set, for instance,

$$w' = (2 + \gamma)(\beta_0 v - w) + w,$$

then v* + αw’ is positive for every v* in V* and every α, that is to say, the
subspace V* + [w’] is definite.

¹If we use an argument similar to the one which yielded the existence of s.

Second Case. $\beta_0 = 0$. Then we first let w and v play the roles of −v
and w, respectively, and arrive at a $\gamma_0$ such that for $\alpha > \gamma_0$

$$v^* + \alpha w$$

is positive while one vector

$$w_1 = v^* + \gamma_0 w$$

is non-positive (thus obviously null). We let now this vector play the role
of $w_0$ and exactly as before arrive at a vector w’’ such that V* + [w’’] is
definite. This completes the proof of the Lemma.

Now let V be a finite-dimensional non-degenerate vector space. If V is a
plane, then, by the Lemma, V contains a definite line. By induction we build 
an increasing chain of definite subspaces which, by the Lemma, can be con- 
tinued as long as there exists a plane which is independent of the subspace. 
This is the case until we have reached a definite subspace, V’, whose dimension 
is that of V minus 1. If V itself is not definite we can represent V as the 
sum of V’ and the line of any vector not contained in V’. As such a vector 
we may use any non-positive vector v’. The opposite of v’ is positive. We 
thus have proved the following 

THEOREM 3. Every finite-dimensional non-degenerate space which is not
definite is the sum of a definite subspace and the v'-line of any non-positive
vector v' which is ≠ o.

Since every indefinite vector space obviously contains a null vector besides
o, and in a non-degenerate space the line of such a vector is semi-definite,
we obtain the following

COROLLARY. Each non-degenerate finite-dimensional vector space which is
not definite is the sum of a definite subspace and a single semi-definite line.

Combining Theorems 1 and 3 we can say 

Each finite-dimensional vector space, V, can be represented as the sum of the 
following parts: 

(1) a uniquely determined totally degenerate subspace (which is O* if and 
only if V is non-degenerate) ; 
(2) a definite subspace (which may be O*); to which, if and only if V contains
a non-positive vector whose opposite is positive, we have to add a third
subspace, namely,

(3) a single line for which we may choose the v-line of any non-positive 
vector. 


9. A SURVEY OF ALL FINITE-DIMENSIONAL VECTOR SPACES


The space O*, and only this space, may be considered as 0-dimensional. 
There are four types of 1-dimensional spaces (‘lines’): definite, semi-definite 
and indefinite lines (which are non-degenerate) and totally degenerate lines. 
By induction we see that there are the following types of n-dimensional 
spaces: 

I. Non-Degenerate Spaces.

1. Definite Spaces.

2. Semi-Definite Spaces. We shall classify them by calling the dimension
of the closed cone of null vectors of a space, its degree.

3. Indefinite Spaces with an n-dimensional cone of negative vectors whose
boundary consists of the null vectors.

II. Degenerate Spaces which are the sum of a non-degenerate space of a
dimension < n (which we shall call the rank of the space) and a totally
degenerate space.

In subsequent papers we shall refer to the definite, semi-definite and 
indefinite spaces as elliptic, parabolic, and hyperbolic spaces, respectively. 
The parabolic spaces of minimum degree, 1, will be called properly parabolic. 
We shall see that in a finite-dimensional vector space every closed cone, for 
a properly chosen definition of the norm, is the cone of the null vectors of a 
parabolic space; and that every open cone may be the cone of the negative
vectors of an indefinite space. Hence every hyperbolic space can be made 
parabolic of maximum degree by redefining the norm of each negative vector 
as 0, while every parabolic space of maximum degree can be made hyperbolic 
by associating with the interior null vectors proper negative norms. 


REFERENCES 


[1] S. Banach, Fund. Math., vol. 3 (1922), 133-181; H. Hahn, Monatshefte Math. Phys., vol.
32 (1922), 1-81; N. Wiener, Bull. Soc. Math. France, vol. 150 (1922), 124-134.

[2] Cf., in particular, “Die metrische Methode in der Variationsrechnung,” Ergebn.
mathem. Kolloquiums, vol. 8 (1937), 1-32; two notes in the Proc. Nat. Acad. Sci., vol.
23 (1937), 246 and vol. 25 (1939), 474; and the lecture “Analysis and Metric Geometry”
in The Rice Institute Pamphlet, vol. 27 (1940), 1-40.

[3] Ergebn. mathem. Kolloquiums, vol. 8 (1937), 32.

[4] Cf., in particular, Pauc’s résumé in the pamphlet “Les méthodes directes en Calcul des
Variations,” (Paris, Hermann, 1941).

[5] Rend. della Acc. Naz. Linc., vol. 26 (193

[6] Ergebn. math. Kolloquiums, vol. 8 (1937), 95 and Alt, loc. cit.


Illinois Institute of Technology 
Chicago 

















GROUPS WITH REPRESENTATIONS OF BOUNDED 
DEGREE 


IRVING KAPLANSKY 


1. Introduction. Let G be a compact group. According to the celebrated
theorem of Peter-Weyl there exists a complete set of finite-dimensional
irreducible unitary representations of G, the completeness meaning that for 
any group element other than the identity there is a representation sending it 
into a matrix other than the unit matrix. If G is commutative, the repre- 
sentations are necessarily one-dimensional. It is an immediate consequence 
of the Peter-Weyl theorem that the converse also holds: if every representation 
is one-dimensional, G is commutative. The main theorem in the present paper 
is a generalization of this result to the case where the representations have 
bounded degree. We may illustrate by stating the next simplest case. The 
representations are one- or two-dimensional if and only if G satisfies the fol- 
lowing condition: for any 4 elements of G the 12 (= 4!/2) products obtained 
from even permutations can be paired off in equal pairs with the 12 products 
obtained from odd permutations. The general result is stated in Theorem 3. 

Such groups exist: for example, the group extension of an abelian group by 
a finite group (Theorem 1). On the other hand, if such a group is connected 
it is abelian (Theorem 2). 

In §§ 2, 3 we present some preliminary remarks on matrices and groups, 
and in § 4 we review some facts on group representations needed for the exten- 
sion from the compact to the locally compact case. In § 5 the main theorems 
appear, and in § 6 a connection with a theorem due to Halmos is described. 


2. Matrix identities.¹ For elements $x_1, \ldots, x_r$ in a ring we shall write

$$[x_1, \ldots, x_r] = \sum \pm\, x_{\pi(1)} \cdots x_{\pi(r)}$$

where the sum runs over all permutations π and the plus or minus sign is
prefixed according as π is even or odd.


LEMMA 1. In any algebra A of order k − 1 we have $[x_1, \ldots, x_k] = 0$ for all
$x_i \in A$.

Proof. Since the relation in question is multilinear, it need only be proved
when $x_1, \ldots, x_k$ are basis elements. In that case at least one repetition occurs,
and consequently a transposition can be performed which leaves $[x_1, \ldots, x_k]$
unchanged. Hence

$$[x_1, \ldots, x_k] = -[x_1, \ldots, x_k] = 0.$$


Received February 23, 1948. 
¹I am greatly indebted to E. R. Kolchin for the contents of §2.




(Formally this argument is invalid for characteristic 2, but the result is still 
correct and may be proved by the usual device of a reduction mod 2.) 

We may apply Lemma 1 to the special case where A is the algebra of n by n
matrices over a field. We shall write r(n) for the smallest integer such that
$[x_1, \ldots, x_r] = 0$ for all n by n matrices.² By Lemma 1 we have $r(n) \le n^2 + 1$.

The following argument gives a lower bound for r. Write t = r(n − 1) − 1.
Suppose $x_1, \ldots, x_t$ are n − 1 by n − 1 matrices with $[x_1, \ldots, x_t] \ne 0$, and we
may suppose to be explicit that $[x_1, \ldots, x_t]$ contains a non-zero term in $e_{jk}$,
where $\{e_{ij}\}$ denote the usual matrix units. Embed the matrix $x_i$ in an n by n
matrix $y_i$ by adjoining a row and column of zeros. Then it is evident that

$$[y_1, \ldots, y_t, e_{kn}, e_{nn}] \ne 0.$$

This proves the following result.


LEMMA 2. $r(n) \ge r(n - 1) + 2$.

It is clear that r(1) = 2 and by Lemma 2 we deduce the lower bound
$r(n) \ge 2n$. For n = 2, r(2) is in fact precisely 4. This apparently exhausts the
known facts concerning r(n).
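
As an illustration only (it is not part of the original argument), the bracket can be evaluated by brute force for small matrices. The following Python sketch, in which the names `sign`, `matmul` and `bracket` are chosen here for convenience, finds three 2 by 2 matrices with a non-zero bracket and checks that the bracket of four randomly chosen integer matrices vanishes, in line with the statement that r(2) is 4. The random check is evidence for the vanishing, not a proof of it.

```python
from itertools import permutations
import random


def sign(perm):
    """Return +1 for an even permutation of 0..k-1 and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def matmul(a, b):
    """Multiply two square matrices given as lists of rows (exact integers)."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def bracket(mats):
    """Signed sum of the products of the given matrices over all orderings."""
    n = len(mats[0])
    total = [[0] * n for _ in range(n)]
    for perm in permutations(range(len(mats))):
        term = mats[perm[0]]
        for index in perm[1:]:
            term = matmul(term, mats[index])
        s = sign(perm)
        for i in range(n):
            for j in range(n):
                total[i][j] += s * term[i][j]
    return total


# Three matrix units of the 2 by 2 matrices: only the ordering
# e11 * e12 * e22 gives a non-zero product, so the bracket is non-zero.
e11, e12, e22 = [[1, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [0, 1]]
three = bracket([e11, e12, e22])

# Four random integer matrices (fixed seed so the run is repeatable).
rng = random.Random(0)
random_mats = [
    [[rng.randint(-9, 9) for _ in range(2)] for _ in range(2)]
    for _ in range(4)
]
four = bracket(random_mats)

print(three)
print(four)
```

Integer entries are used so that the cancellation is exact and no rounding tolerance has to be chosen.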


3. A certain class of groups. Let us say that a group G satisfies the condition
$P_n$ (n ≥ 2) if the following is true: for any n elements in G the set of n!/2
products obtained from even permutations coincides with the n!/2 products
obtained from odd permutations. It should be noted that it is not asserted
that there is a fixed way of carrying out the pairing once for all; the particular
correspondence presumably depends upon the particular n elements in question.

It is fairly evident that $P_k$ implies $P_{k+1}$. $P_2$ simply asserts commutativity,
and so does $P_3$ as can be seen by taking one of the three elements to be the
identity. Starting at k = 4 there exist non-abelian groups satisfying $P_k$;
for example, the symmetric group on three elements satisfies $P_4$. The following
theorem provides us with a substantial class of such groups.
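
For a small finite group the condition just defined can be tested exhaustively. The sketch below is an illustration, not part of the paper: it represents the symmetric group on three elements by tuples of images, reads "coincides" as equality of the two collections of products counted with multiplicity (the pairing described above), and uses helper names (`is_odd`, `compose`, `satisfies_P`) introduced only for this example.

```python
from collections import Counter
from itertools import permutations, product


def is_odd(perm):
    """True if the permutation of 0..k-1 has an odd number of inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2 == 1


def compose(p, q):
    """Product of two permutations written as tuples of images (q acts first)."""
    return tuple(p[q[i]] for i in range(len(q)))


def satisfies_P(elements, multiply, k):
    """Brute-force test: for every k-tuple of group elements, the products over
    even orderings and over odd orderings must agree as multisets."""
    orderings = [(perm, is_odd(perm)) for perm in permutations(range(k))]
    for chosen in product(elements, repeat=k):
        even, odd = Counter(), Counter()
        for perm, odd_flag in orderings:
            value = chosen[perm[0]]
            for index in perm[1:]:
                value = multiply(value, chosen[index])
            (odd if odd_flag else even)[value] += 1
        if even != odd:
            return False
    return True


S3 = list(permutations(range(3)))  # symmetric group on three elements
Z3 = [0, 1, 2]                     # cyclic group of order 3, an abelian example


def add_mod_3(a, b):
    return (a + b) % 3


for k in (2, 3, 4):
    print(k, satisfies_P(S3, compose, k))
```

The search visits every k-tuple, so it is only practical for very small groups and small k.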


THEOREM 1. A group extension of an abelian group by a finite group of order
n satisfies $P_{n^2+1}$.

Proof. We suppose that G is abelian, H of order n, and K/G ≅ H. Choose
fixed representatives $k_1, \ldots, k_n \in K$ for the cosets of K mod G. Every element
of K can be uniquely written $g k_i$, $g \in G$. Let b be a product of $n^2 + 1$ elements
of K. In such a product some k, say $k_i$, must be repeated at least n + 1 times.
Let $g k_i$ be the element appearing at the first occurrence of $k_i$, and $g' k_i$ one of
the later occurrences. Write x for the product of the k's intervening between
these two instances of $k_i$. The interchange of the pair $g k_i$ and $g' k_i$ will leave
b unchanged provided that $k_i x$ lies in G. Since we have n + 1 or more occurrences
of $k_i$ and only n cosets of K mod G, it will have to happen at least once
that an interchange of two of the terms comprising b leaves b unchanged.





²It is conceivable that r(n) depends on the coefficient field, or rather on the characteristic
of the latter. To be explicit, one may take the characteristic 0 case throughout the paper.






















We now specifically pick out the first element in the product b whose interchange
with a later element is legal. In all the $(n^2 + 1)!$ permutations we do
the same thing, and thus set up a one-to-one correspondence between the even
permutations and the odd permutations. This proves Theorem 1.
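
The key step of this proof, the existence of an interchange that leaves the product unchanged, can be checked by machine in the smallest non-abelian case. The following sketch is offered as an illustration under stated assumptions: K is taken to be the symmetric group on three elements, G its abelian subgroup of order 3, so that n = 2 and the words have length $n^2 + 1 = 5$; the function names are hypothetical.

```python
from itertools import permutations, product


def compose(p, q):
    """Product of two permutations written as tuples of images (q acts first)."""
    return tuple(p[q[i]] for i in range(len(q)))


def word_product(word):
    """Multiply the factors of a word from left to right."""
    value = word[0]
    for factor in word[1:]:
        value = compose(value, factor)
    return value


def invariant_swap(word):
    """Return the first pair (i, j), i < j, whose interchange leaves the
    product of the word unchanged, or None if there is no such pair."""
    target = word_product(word)
    for i in range(len(word)):
        for j in range(i + 1, len(word)):
            swapped = list(word)
            swapped[i], swapped[j] = swapped[j], swapped[i]
            if word_product(swapped) == target:
                return (i, j)
    return None


K = list(permutations(range(3)))  # assumed example: symmetric group, order 6
n = 2                             # order of the quotient by the subgroup of order 3

# Every word of n^2 + 1 = 5 factors should admit such an interchange.
all_words_have_swap = all(
    invariant_swap(word) is not None
    for word in product(K, repeat=n * n + 1)
)
print(all_words_have_swap)
```

The search takes the first admissible position and then the first admissible later position, which mirrors the choice of "the first element in the product b whose interchange with a later element is legal".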

Theorem 1 does not give the best possible result. Indeed we shall show
below that a group extension of an abelian group by a group of order n actually
satisfies $P_{s(n)}$, where

(1)    s(n) = r(n) for r(n) even
            = r(n) + 1 for r(n) odd,

and r(n) is the integer defined in §2. Thus for n = 2 we get $P_4$ instead of the
$P_5$ of Theorem 1. However I am unable to prove this refinement without the
detour to group representations and Banach algebras.

We shall conclude this section by showing that there are no connected non- 
abelian groups of the kind under discussion. Actually we prove a (formally 
at least) stronger result, in order to carry through an induction. 


THEOREM 2. Let G be a connected topological group having for some fixed
n ≥ 2 the following property: any product $a_1 \cdots a_n$ is equal to a proper
permutation. Then G is abelian.

Proof. We shall show that G has the same property for n − 1 and hence
finally reach n = 2. Let then $a_1, \ldots, a_{n-1}$ be elements in G. For any b in G
the product $a_1 \cdots a_{n-1} b$ must be equal to a proper permutation. We may
suppose that there is a neighbourhood U of the identity such that for b in U the
proper permutation in question keeps the order of $a_1, \ldots, a_{n-1}$ fixed; for
otherwise we can take the limit as b approaches the identity and conclude that
$a_1 \cdots a_{n-1}$ equals a proper permutation. Thus for each b in U we have one
of the n − 1 possible equations

$$a_1 \cdots a_{n-1} b = a_1 \cdots a_i\, b\, a_{i+1} \cdots a_{n-1} \qquad (i = 0, \ldots, n - 2).$$

The ith equation asserts that b commutes with $a_{i+1} \cdots a_{n-1}$ and hence is valid
in a closed set. Thus U is covered by a finite number of these closed sets, and
one of them must have a non-void interior. This says that the centralizer of
$a_{i+1} \cdots a_{n-1}$ is open. Since G is connected, this centralizer must be all of G
and hence $a_{i+1} \cdots a_{n-1}$ is in the centre. For i ≥ 1 this yields the desired result
obviously, while for i = 0 the assertion that $a_1 \cdots a_{n-1}$ is in the centre implies

$$a_1 a_2 \cdots a_{n-1} = a_2 \cdots a_{n-1} a_1.$$


COROLLARY. If a connected topological group satisfies $P_n$, it is abelian.


4. Group representations. In order to formulate our main theorem for a 
locally compact group G, it would not suffice to assume that the finite-dimen- 
sional irreducible unitary representations of G have bounded degree; for there 
exist groups (e.g. the Lorentz group) for which the only finite-dimensional 
unitary representation is the trivial one. Thus we must impose a further 
condition which will entail the existence of a respectable number of finite- 
dimensional representations. For our purposes a convenient hypothesis of 









this kind can be formulated in terms of the representations introduced by 

Segal [4]. We devote this section to a brief statement of the necessary facts. 
Let A denote the $L_1$-algebra of the locally compact group G, that is, the
algebra of all complex-valued functions summable with respect to the left
Haar measure of G, with convolution as multiplication:

$$fg(x) = \int f(y)\, g(y^{-1}x)\, dy.$$
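
For a finite group, with the counting measure playing the role of Haar measure, the convolution above is a finite sum and the identities of §2 can be tested directly in the group algebra. The sketch below is an illustration under that assumption, using the symmetric group on three elements; functions on the group are stored as dictionaries, and the names `convolve`, `bracket` and `delta` are introduced here only for the example.

```python
from itertools import permutations
import random


def compose(p, q):
    """Product of two permutations written as tuples of images (q acts first)."""
    return tuple(p[q[i]] for i in range(len(q)))


def sign(perm):
    """Return +1 for an even permutation of 0..k-1 and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


S3 = list(permutations(range(3)))


def convolve(f, g):
    """(fg)(x) = sum over y of f(y) g(y^-1 x), i.e. the sum of f(y) g(z) over yz = x."""
    result = {x: 0 for x in S3}
    for y, fy in f.items():
        for z, gz in g.items():
            result[compose(y, z)] += fy * gz
    return result


def bracket(functions):
    """Signed sum of the convolution products over all orderings."""
    total = {x: 0 for x in S3}
    for perm in permutations(range(len(functions))):
        term = functions[perm[0]]
        for index in perm[1:]:
            term = convolve(term, functions[index])
        s = sign(perm)
        for x in S3:
            total[x] += s * term[x]
    return total


def delta(g):
    """The function equal to 1 at g and 0 elsewhere."""
    return {x: 1 if x == g else 0 for x in S3}


identity, a, b = (0, 1, 2), (1, 0, 2), (0, 2, 1)

# Three factors: the bracket of delta functions records that ab differs from ba.
three = bracket([delta(identity), delta(a), delta(b)])

# Four factors with random integer values (fixed seed for repeatability).
rng = random.Random(1)
fs = [{x: rng.randint(-5, 5) for x in S3} for _ in range(4)]
four = bracket(fs)

print(three)
print(four)
```

Applied to delta functions, the vanishing of the bracket at every group element says that each product occurs equally often among the even and the odd orderings, which is the combinatorial condition of §3.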


Let E be the algebra of bounded operators on a Banach space. A B-representation
[4, p. 79] of G is a multiplicative homomorphism T of G into E which
sends the identity into the unit operator, is continuous in the strong topology
of E, and is such that ||T(a)|| is bounded for a ∈ G. A B-representation is
irreducible if it admits no proper closed invariant subspaces. Irreducible
B-representations may be constructed as follows. Let M be a regular maximal
left ideal in A, and associate with a ∈ G the operator $T_a: u + M \to u_a + M$,
where $u_a(x) = u(a^{-1}x)$. We shall call these representations primitive, a
designation suggested by the fact that the extension of the representation to A
has as its kernel the ideal P consisting of all x with xA ⊆ M; P is a primitive
ideal in the sense of Jacobson [2]. Conversely every primitive ideal in A is
associated in this fashion with at least one primitive representation of G.
The following facts are known: (1) all primitive representations are irre- 
ducible, (2) any irreducible finite-dimensional unitary representation is similar 
to a primitive representation, (3) if G is compact or abelian, all primitive 
representations of G are finite-dimensional. It is an open question whether 
every irreducible B-representation is similar to a primitive representation. 


5. Main theorem. In terms of the concepts introduced in the previous 
sections, the principal result can be stated as follows. 


THEOREM 3. The following two statements are equivalent for a unimodular
locally compact group G:³

(a) All primitive representations of G are finite-dimensional and of degree at
most n,

(b) G satisfies the condition $P_{s(n)}$, where s(n) is defined by (1).

It is to be observed that if G is compact, the theorem simplifies percep- 
tibly: compact groups are unimodular, and their primitive representations are 
automatically finite-dimensional. 

Proof. Suppose that (a) holds. Then it follows virtually from the definition
of the primitive representations that for every primitive ideal P in
$A = L_1(G)$, A − P is finite-dimensional and is in fact a total matrix algebra of
degree at most n. Hence A − P satisfies the identity $[x_1, \ldots, x_k] = 0$ for
k = r(n) and a fortiori for k = s = s(n). Now the intersection of the primitive
ideals of A is 0: this is a consequence of the semi-simplicity of A: [4, Th. 1.5]
and [2, Th. 25]. Hence $[f_1, \ldots, f_s] = 0$ holds for all $f_i \in A$. The s-fold
convolution of functions is given by

$$f_1 \cdots f_s(x) = \int \cdots \int f_1(y_1) f_2(y_2) \cdots f_{s-1}(y_{s-1})\, f_s[(y_1 \cdots y_{s-1})^{-1} x]\, dy_1 \cdots dy_{s-1}. \tag{2}$$

³A group is unimodular if its right and left Haar measures coincide; cf. [5, p. 39].


We shall now study the effect of a permutation π on $f_1 \cdots f_s$. If π does not
involve the letter s, its effect on (2) may be described as carrying out π on the
y's in $(y_1 \cdots y_{s-1})^{-1}$, and otherwise leaving the right side of (2) unchanged.
Next we try the case π = (is). We carry out the interchange of $f_i$ and $f_s$ in (2)
and then replace $y_i$ by

$$(y_1 \cdots y_{i-1})^{-1}\, x\, y_i^{-1}\, (y_{i+1} \cdots y_{s-1})^{-1} \tag{3}$$

(a legal change of variable in view of the assumed unimodularity of G). This
replaces

$$(y_1 \cdots y_{s-1})^{-1} x \tag{4}$$

by $y_i$ and so finally gives us the right side of (2) changed by the substitution
of (3) for (4). In view of the fact that s is even, it can be verified that the
permutation (4) → (3) is odd.

The general permutation π which does involve s can be written uniquely as
$\pi = (is)\pi_1$, where $\pi_1$ is independent of s. The effect of π on the right side of
(2) can thus be described as changing the argument of $f_s$ by the permutation
(4) → (3), followed by the permutation $\pi_1$ on the y's. This is a one-to-one
correspondence: given the induced permutation on the argument of $f_s$ we can
unambiguously reconstruct π; for the position of x (in the ith place) gives
us the portion (is), and the position of the y's then determines $\pi_1$. Moreover,
the correspondence preserves the parity of π, as we have seen.

We may summarize as follows. We have

$$\int \cdots \int f_1(y_1) f_2(y_2) \cdots f_{s-1}(y_{s-1})\, Z\, dy_1 \cdots dy_{s-1} = 0, \tag{5}$$

where we have written Z for

$$Z = \sum_j \pm f_s(z_j),$$

$z_j$ being the general permutation of (4), and the plus or minus sign being taken
according to the parity of the permutation $z_j$. Since (5) holds for all $f_i$ in A
and in particular for all continuous $f_i$ in A we deduce that Z = 0 for all
continuous $f_s$ in A. Since a continuous function in A can take arbitrary values
at any finite subset of G, we conclude that G must satisfy the condition $P_s$.

We now proceed to the proof of the converse. Suppose that (b) holds. Then
the computation above is reversible to the point where we have $[f_1, \ldots, f_s] = 0$
for $f_i \in A$. Let P be a primitive ideal in A; the identity $[x_1, \ldots, x_s] = 0$ is of
course inherited by A − P. Theorem 1 of [3] asserts that a primitive algebra
satisfying a polynomial identity is finite-dimensional over its centre. Hence
A − P is finite-dimensional over its centre C. By the Gelfand-Mazur theorem
on normed fields, C is just the complex numbers. Hence A − P is an algebra
of finite order over the complex numbers, and is indeed a full matrix algebra.
As for the degree of these matrices, it cannot exceed n; for by Lemma 2,
$s \le r(n) + 1 < r(n + 1)$ and consequently matrices of degree n + 1 fail to
satisfy $[x_1, \ldots, x_s] = 0$. This shows that the primitive representations of G
are finite-dimensional and of degree at most n, and concludes the proof of
Theorem 3.

The criterion provided by Theorem 3 is in many cases easy to apply. For 
example, let G and H be unimodular locally compact groups whose primitive 
representations have bounded degree; then from Theorem 3 it follows that the 
same is true for G × H, any unimodular homomorphic image of G, and any closed
unimodular subgroup of G. Also the following result is a corollary of Theorems 
2 and 3. 


COROLLARY. Let G be a connected unimodular locally compact group whose
primitive representations are finite-dimensional and of bounded degree. Then G 
is abelian. 


This corollary may be derived in another way which we shall now describe. 
Let G be a connected locally compact group which is maximally almost- 
periodic, that is, G has a complete set of finite-dimensional unitary representa- 
tions. (This hypothesis is weaker than the assumption that the primitive repre- 
sentations of G are finite-dimensional.) By a theorem of Freudenthal [5, p. 129], 
G is the direct product of a compact group and a finite number of copies of 
the additive group of real numbers. The question as to when the irreducible 
unitary representations are of bounded degree is thereby reduced to the com- 
pact case; and by considering the images under the representations, we further 
reduce to the case of a compact Lie group. In fact, our problem becomes pre- 
cisely the following: prove that a connected compact simple Lie group possesses 
irreducible representations of arbitrarily high degree. That this is in fact the 
case follows from known classical results. 

The corresponding theorem for Lie algebras asserts that a simple Lie algebra 
has irreducible representations of arbitrarily high degree. In this form, the 
theorem has recently been given a purely algebraic proof by Harish-Chandra 
[6]. It is perhaps worth remarking that, by standard devices, the theorem on 
Lie algebras can conversely be derived from the group theorem. 

We return to the study of the group K of Theorem 1, and shall derive the
purely group-theoretic fact that K satisfies $P_{s(n)}$. We give K the discrete
topology, which assures its local compactness and unimodularity. Then by
Theorems 1 and 3 we have that the primitive representations are finite-dimensional
and of bounded degree. Theorems 1 and 3 also yield a bound for the
degree in question, but a better bound can be obtained by a simple direct
argument. In fact we assert that any finite-dimensional irreducible unitary
representation T of K is of degree at most n. For the induced representation
of G decomposes into one-dimensional representations, since G is abelian.
Let a be a non-zero vector in one of these G-invariant one-dimensional
subspaces. Then for g ∈ G, aT(g) is a multiple of a. Using the notation of the
proof of Theorem 1, we deduce that the invariant subspace generated by a is
spanned by $aT(k_1), \ldots, aT(k_n)$ and hence is at most n-dimensional, as desired.
Quotation of Theorem 3 proves that K satisfies $P_{s(n)}$.

We shall conclude this section with a variant of Theorem 3: 


THEOREM 4. The following two statements are equivalent for a unimodular 
locally compact group G: 

(a) The primitive representations of G are finite-dimensional and of bounded 
degree. 

(b) The $L_1$-algebra of G satisfies a polynomial identity.


Proof. The proof coincides with the corresponding portions of the proof of
Theorem 3, except for the following remark. In proving that (b) implies (a)
we take a primitive ideal P in $A = L_1(G)$ and quote Theorem 1 of [3] to sustain
the claim that A − P is finite-dimensional. But more than that: Theorem 1
of [3] shows that the dimension of A − P has a fixed upper bound depending
only on the polynomial identity in question (cf. remark (6) on p. 580 of [3]).
The rest of the proof proceeds unchanged.


6. A theorem of Halmos. The study of groups with bounded representations
arose in connection with an attempt to generalize a theorem of Halmos,
which we shall now describe. Let G be a compact group and S a continuous
automorphism of G. The uniqueness of Haar measure shows that S induces a
measure preserving transformation on G, which in turn induces a unitary
operator $U: f \to f^S$ on $L_2(G)$. We say that S is ergodic if the only solutions of
$f^S = f$ are constant. In [1] Halmos studied the case where G is commutative,
and showed (Th. 3) that if S is ergodic, the spectral type of U is entirely
determined by the cardinal number of the character group of G. We refer the reader
to [1] for the precise result.

Now let G be a compact group which is not necessarily commutative. The
automorphism S induces in a natural way a permutation $\pi_S$ of the irreducible
representations of G. This permutation leaves the trivial representation $\psi$
fixed ($\psi$ sends every element into the matrix (1)). The analogue of Halmos’s
[1, Th. 1] is now valid: S is ergodic if and only if $\pi_S$ has no finite orbits other
than $\psi$. The proof is virtually the same as that given by Halmos: one uses
the coordinates of irreducible representations in place of characters.
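
The criterion can be tried out in the commutative case treated by Halmos. As an assumed example (not taken from the paper), let G be the two-dimensional torus and S the automorphism given by an integer matrix M of determinant ±1. The characters of G then correspond to integer vectors, on which S acts through M (or its transpose, depending on conventions). A non-zero character lies in a finite orbit exactly when some power M^k with k ≥ 1 fixes a non-zero integer vector, that is, when det(M^k - I) = 0. The Python sketch below, with hypothetical function names, searches for such a k up to a chosen bound, so a result of None is only evidence of ergodicity, not a proof.

```python
def mat_mul2(a, b):
    """Product of two 2 x 2 integer matrices given as nested lists."""
    return [[a[0][0] * b[0][0] + a[0][1] * b[1][0],
             a[0][0] * b[0][1] + a[0][1] * b[1][1]],
            [a[1][0] * b[0][0] + a[1][1] * b[1][0],
             a[1][0] * b[0][1] + a[1][1] * b[1][1]]]


def smallest_period(m, max_period):
    """Smallest k in 1..max_period with det(m^k - I) == 0, or None.

    A value k means some non-zero character is fixed by the k-th power of
    the induced permutation, so the automorphism is not ergodic.
    """
    power = [[1, 0], [0, 1]]
    for k in range(1, max_period + 1):
        power = mat_mul2(power, m)
        det = (power[0][0] - 1) * (power[1][1] - 1) - power[0][1] * power[1][0]
        if det == 0:
            return k
    return None


if __name__ == "__main__":
    # No power up to the bound has a non-zero fixed vector.
    print(smallest_period([[2, 1], [1, 1]], 50))   # None
    # Rotation by a quarter turn has order 4, so every character has a finite orbit.
    print(smallest_period([[0, -1], [1, 0]], 50))  # 4
```

The determinant test uses the fact that a rational matrix with a non-trivial kernel also kills some non-zero integer vector, so it suffices to work over the integers.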

Supposing that S is ergodic, we can now proceed to discuss the spectral type
of U. Of course $U(\psi) = \psi$. By appropriate choice of the remaining coordinates
of irreducible representations, which together with $\psi$ form an orthonormal
base of $L_2(G)$, we can arrange them in a double array $\phi_{i,j}$ such that
$U(\phi_{i,j}) = \phi_{i,j+1}$. Here the index j runs over all integers, and the index i over the
orbits of $\pi_S$. If we let c denote the number of orbits in question, we have
proved Halmos’s [1, Th. 3] except for the assertion that c is infinite.* If the
representations have unbounded degree, then it is clear that c is infinite, for
the permutation $\pi_S$ necessarily preserves degree. At the other extreme, if all
the representations are one-dimensional (G commutative), Halmos provided
a group-theoretic argument on the character group to show that c is infinite
[1, Th. 2]. There remains the case of representations of bounded degree, where
it would be necessary to generalize suitably Halmos’s argument. I have been
unable to supply such an argument, but possibly the results in this paper will
point the way toward the completion of this problem.

*If there are an uncountable number of irreducible representations, it is clear that c is infinite
(and equal to that number). Thus further discussion is really needed only for the case of a
countable number of irreducible representations.


Postscript (December 1, 1948) 


Since this manuscript was completed, a paper by F. W. Levi has appeared:
“On Skew Fields of a Given Degree,” J. Indian Math. Soc., vol. 11 (1947),
85-88. Reference is made there to a paper to be published in the
Mathematische Annalen. In the notation of §2, this latter paper proves (among
other things) that r(n) is even and r(3) = 6. The distinction between r(n)
and s(n) may therefore be suppressed.


REFERENCES 


[1] P. R. Halmos, “On Automorphisms of Compact Groups,” Bull. Amer. Math. Soc., vol.
49 (1943), 619-624.

[2] N. Jacobson, “The Radical and Semi-simplicity for Arbitrary Rings,” Amer. J. of Math.,
vol. 67 (1945), 300-320.

[3] I. Kaplansky, “Rings with a Polynomial Identity,” Bull. Amer. Math. Soc., vol. 54
(1948), 575-580.

[4] I. E. Segal, “The Group Algebra of a Locally Compact Group,” Trans. Amer. Math. Soc.,
vol. 61 (1947), 69-105.

[5] A. Weil, “L’Intégration dans les groupes topologiques et ses applications,” Actualités
Scientifiques et Industrielles, no. 869, Paris, 1938.

[6] Harish-Chandra, “On Representations of Lie Algebras,” to appear in the Annals of
Mathematics.


University of Chicago 





